Observability And Analytics

Operate with evidence

Builder Insights needs more than application logs. It needs a clear observability model that tells engineering whether the system is healthy, tells product whether the platform is being used, and tells leadership whether adoption and signal quality are improving.

Observability layers

reliability telemetry for engineering and operators
usage analytics for product and leadership
audit visibility for sensitive actions and access changes
reporting definitions that stay consistent over time

Four layers of visibility

Layer	Audience	Primary question
Reliability telemetry	engineering and operators	is the platform healthy right now?
Product analytics	product and leadership	are people using the system in valuable ways?
Security and audit visibility	admins responsible for access and security	who changed access or performed sensitive operations?
Executive reporting	leadership	what trends and adoption signals matter over time?

Recommended instrumentation model

Logs

Every critical request path and privileged workflow should emit structured logs with enough context to support triage.

request and route context
auth and role information where appropriate
error payload summaries
sync and retry state transitions
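
As an illustration of the fields listed above, a structured log line could be emitted roughly as follows. This is a minimal sketch using only the Python standard library, not a description of the approved stack (which is still an open decision). The logger name, the `log_context` helper, the route, the role name `field_rep`, and the field names are all assumptions made for the example.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Optional


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Anything passed as extra={"context": {...}} becomes record.context.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def log_context(route: str, request_id: str, role: Optional[str] = None,
                sync_state: Optional[str] = None,
                error_summary: Optional[str] = None) -> dict:
    """Hypothetical helper: build the `extra` argument, dropping unset fields."""
    fields = {
        "route": route,
        "request_id": request_id,
        "role": role,
        "sync_state": sync_state,
        "error_summary": error_summary,
    }
    return {"context": {k: v for k, v in fields.items() if v is not None}}


def build_logger(name: str = "builder_insights") -> logging.Logger:
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


logger = build_logger()
logger.warning(
    "sync retry scheduled",
    extra=log_context("/api/sync", "req-123", role="field_rep", sync_state="retrying"),
)
```

Keeping the context in one dictionary makes it easier to omit auth and role fields on paths where logging them would not be appropriate.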

Metrics

Metrics should make health visible before users tell the team something is broken.

latency and error rates
uptime and request success ratios
sync success and failure counts
queue drain and reconnect behavior
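
To make the first two items concrete, the sketch below derives an error rate, a success ratio, and a latency percentile from raw observations. It is an in-memory illustration only; a real deployment would presumably report these through whatever metrics stack is approved. The class name `RouteMetrics` and the nearest-rank percentile method are choices made for the example.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class RouteMetrics:
    """Hypothetical per-route accumulator for latency and error observations."""

    latencies_ms: List[float] = field(default_factory=list)
    errors: int = 0

    def observe(self, latency_ms: float, ok: bool) -> None:
        self.latencies_ms.append(latency_ms)
        if not ok:
            self.errors += 1

    @property
    def total(self) -> int:
        return len(self.latencies_ms)

    def error_rate(self) -> float:
        # Report 0.0 rather than dividing by zero when nothing was observed.
        return self.errors / self.total if self.total else 0.0

    def success_ratio(self) -> float:
        return 1.0 - self.error_rate()

    def latency_percentile(self, pct: float) -> float:
        """Nearest-rank percentile over the recorded latencies."""
        if not self.latencies_ms:
            return 0.0
        ordered = sorted(self.latencies_ms)
        rank = max(1, math.ceil(pct / 100 * len(ordered)))
        return ordered[rank - 1]


sync_metrics = RouteMetrics()
for latency, ok in [(120.0, True), (95.0, True), (2300.0, False), (140.0, True)]:
    sync_metrics.observe(latency, ok)

print(sync_metrics.error_rate(), sync_metrics.latency_percentile(95))
```

The same shape would apply to sync success and failure counts: count outcomes at the point where they happen, and compute ratios at read time.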

Analytics

Usage analytics should explain whether Builder Relations teams are adopting the workflow and whether the product is creating real operating value.

active users by role and org slice
capture volume by event, team, and time period
offline usage and recovery rates
dashboard and reporting engagement
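
One way to keep these definitions consistent over time is to write each metric down as a small function over a shared event record. The sketch below assumes a hypothetical `CaptureEvent` shape; the field names, role names, and the definition of "active" (at least one capture in the input set) are illustrative assumptions, not agreed definitions.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class CaptureEvent:
    """Hypothetical record of one capture made by a user."""

    user_id: str
    role: str
    event: str           # the field event where the capture happened
    day: str             # ISO date, e.g. "2024-01-02"
    offline: bool = False


def active_users_by_role(events: Iterable[CaptureEvent]) -> Dict[str, int]:
    """Count distinct users per role with at least one capture."""
    users = defaultdict(set)
    for e in events:
        users[e.role].add(e.user_id)
    return {role: len(ids) for role, ids in users.items()}


def capture_volume(events: Iterable[CaptureEvent], by: str) -> Dict[str, int]:
    """Count captures grouped by one attribute: "event", "role", or "day"."""
    return dict(Counter(getattr(e, by) for e in events))


def offline_share(events: Iterable[CaptureEvent]) -> float:
    """Fraction of captures that were made offline."""
    items = list(events)
    if not items:
        return 0.0
    return sum(1 for e in items if e.offline) / len(items)


sample = [
    CaptureEvent("u1", "field_rep", "summit", "2024-01-01"),
    CaptureEvent("u1", "field_rep", "summit", "2024-01-02", offline=True),
    CaptureEvent("u2", "field_rep", "meetup", "2024-01-02"),
    CaptureEvent("u3", "manager", "summit", "2024-01-02"),
]
print(active_users_by_role(sample))
print(capture_volume(sample, by="event"))
```

Defining each metric once, in code, gives dashboards and leadership reports a single definition to share.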

Audit events

Sensitive actions should be traceable long after the original operator has forgotten they happened.

login and failed-auth events
role changes and entitlement changes
admin-only route access
exports and high-sensitivity operations
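
An audit event covering the categories above might be modeled as an immutable record that serializes to one JSON line. This is a sketch under assumptions: the action names, the `AuditEvent` fields, and the example identifiers are hypothetical, and the real scope is still listed as drafted in the decision tracker.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Assumed action vocabulary, mirroring the list above.
AUDIT_ACTIONS = {
    "login",
    "failed_auth",
    "role_change",
    "entitlement_change",
    "admin_route_access",
    "export",
}


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit record; frozen so it cannot be edited after creation."""

    action: str
    actor_id: str
    target: Optional[str] = None
    detail: dict = field(default_factory=dict)
    occurred_at: str = field(default_factory=_utc_now)

    def __post_init__(self) -> None:
        # Reject unknown actions so the audit vocabulary stays controlled.
        if self.action not in AUDIT_ACTIONS:
            raise ValueError(f"unknown audit action: {self.action}")

    def to_json_line(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


event = AuditEvent(
    action="role_change",
    actor_id="admin-7",
    target="user-42",
    detail={"from": "viewer", "to": "editor"},
)
print(event.to_json_line())
```

Recording the actor, the target, and the before and after values is what lets a reviewer reconstruct the change later without relying on anyone's memory.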

Decision tracker

Decision	Current recommendation	Owner	Status
Telemetry stack	approved internal logging, metrics, and tracing stack	Platform Engineering and Application Engineering	Needs decision
Reliability dashboard ownership	engineering and operators	Application Engineering	Drafted
Product usage dashboard ownership	product and leadership-facing analytics owner	Product and Builder Relations Ops	Drafted
Audit-event scope	auth, role changes, privileged routes, exports	Security and Application Engineering	Drafted

Priority dashboards

1. Reliability dashboard

Track auth, sync, API health, and the paths most likely to break user trust first.

2. Usage dashboard

Track adoption, capture volume, and role-based activity so the team can tell whether the product is actually being used.

3. Audit dashboard

Track role changes, privileged actions, and unusual access patterns that matter both for day-to-day operations and for defending access decisions later.

4. Leadership dashboard

Track the higher-level patterns that connect product usage to business value and field intelligence outcomes.

Minimum telemetry baseline

QA check

What should exist before broader internal rollout

structured logs for the critical request paths
metrics for auth, sync, queue state, and API health
an explicit definition of core usage metrics
audit events for role changes and sensitive operations
alerting for major reliability regressions
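
The alerting item could start as a simple comparison of a metrics snapshot against agreed limits. The sketch below is illustrative only: the metric names and threshold values are placeholders, since the actual thresholds are among the open stakeholder questions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Threshold:
    metric: str
    max_value: float


# Placeholder limits for illustration; real values need owner sign-off.
PLACEHOLDER_THRESHOLDS = [
    Threshold("auth_error_rate", 0.02),
    Threshold("sync_failure_rate", 0.05),
    Threshold("api_p95_latency_ms", 1500.0),
]


def evaluate(snapshot: Dict[str, float], thresholds: List[Threshold]) -> List[str]:
    """Return one alert message per breached or missing metric."""
    alerts = []
    for t in thresholds:
        value = snapshot.get(t.metric)
        if value is None:
            # Missing data is itself a signal that telemetry is incomplete.
            alerts.append(f"{t.metric}: no data")
        elif value > t.max_value:
            alerts.append(f"{t.metric}: {value} exceeds {t.max_value}")
    return alerts


print(evaluate({"auth_error_rate": 0.01, "sync_failure_rate": 0.2}, PLACEHOLDER_THRESHOLDS))
```

Treating a missing metric as an alert keeps the baseline honest: a dashboard with no data should not read as a healthy system.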

Stakeholder questions to answer

Engineering and platform owners

what logging, metrics, and tracing stack is approved internally?
what alerting thresholds matter for auth, sync, and internal APIs?
how should telemetry from Kanopy-hosted services be collected?

Product and leadership owners

which usage metrics best reflect adoption and value?
how should Builder Relations activity be sliced by role, event, or reporting line?
which dashboards need leadership-ready reporting versus operator-only detail?

Security and compliance-minded owners

which access and privilege events require audit retention?
what export, admin, or user-management actions count as sensitive?

Common failure mode to avoid

Risk

Do not confuse analytics with observability

Usage analytics can tell you whether people are using the platform. They cannot replace the logs, metrics, and alerts needed to keep the platform healthy. Treat reliability telemetry and product analytics as related but distinct layers.
