Skip to content

SQL Agent

SQL Agent is an application that can be deployed to produce and maintain a replica of the system of record data in a Microsoft SQL Server database. It subscribes to the CDC topic, consumes change messages, and reproduces those changes in matching tables in the database.

The SQL Agent is a .NET 10 application and can be deployed on Windows and Linux.

Startup

When SQLAgent starts, it first bootstraps its configuration, opens its local state database, restores its protected secrets, and brings up its TLS-protected management endpoint. If ConfigServer is configured, it registers if necessary, authenticates, downloads the profile configuration, and obtains a TLS certificate if one is not already installed. If ConfigServer is not being used, or cannot be reached, it falls back to locally cached or manually maintained configuration files. Startup only continues if configuration and TLS material are available.

After bootstrap, SQLAgent validates that the replay environment is actually usable. It loads the configured SQL helper assembly, checks that helper classes exist for all enabled tables, initializes its multi-record helper, verifies Kafka access, verifies S3 access when remote snapshot storage is configured, connects to SQL Server, checks that the target server is SQL Server 2022 or later, checks that PolyBase is installed and available, opens the database master key, starts a database transaction, and restores the last seeded snapshot information from persistent state for status reporting. It also creates a PID file and starts its local state synchronization task.

From there, startup follows one of several paths. If the SQL replica already has all expected tables and the agent has a valid persisted replay offset, it simply resets its CDC consumer to the saved replay position and begins normal processing. If startup table checking is enabled and missing tables are found, the agent switches into recovery. If all replica tables are missing, it treats that situation as a full restore: it uses the newest Parquet snapshot it can find (or requests a new snapshot if necessary), restores the tables, records the snapshot as the new seed source, and catches up from the snapshot’s CDC offset to the current end of the CDC stream. If only some tables are missing, it performs a selective restore instead. In that case, it must already have a valid persisted replay offset, and it searches for a Parquet snapshot at or before the offset that contains all of the missing tables. It then restores only those tables and catches them up from the snapshot’s offset to the agent’s saved replay position. If no safe resume point exists, or no suitable snapshot can be found for a selective restore, startup stops rather than risking an inconsistent SQL replica.

Main Processing

Once SQLAgent enters its main loop, it continuously polls the configured CDC topic. If no work is available, it commits any open database batch, records the idle period, sleeps for the configured inactivity interval, and then polls again. If the agent is paused, it first commits the current batch, changes state to Paused, and waits. When it is resumed, it changes to Resuming, rechecks the database for missing tables, and only then returns to normal CDC processing.

For each CDC message, SQLAgent first decides whether the message should be processed, ignored, or suppressed. Normal insert, update, and delete change messages are converted into row operations against the matching SQL table. Application-controlled transaction messages such as start and commit are not replayed to SQL; they are logged and skipped, and the replay position is still advanced so processing can continue. Invalid or unsupported messages are also logged and skipped. For data-bearing messages, the agent identifies the target table, ensures the table metadata is known, and then applies the change through the appropriate SQL helper. Inserts that find an existing row automatically switch to an update. Updates that cannot find the target row automatically switch to an insert. Delete failures and other SQL-helper errors are logged, and depending on configuration, the agent either continues or stops.

SQLAgent applies changes in batches inside a SQL transaction rather than committing every message immediately. A batch is committed when it reaches the configured batch size, when a batch timeout expires, when the agent becomes idle, or before and after special operations such as pause handling or snapshot-based repair. On each batch commit, the SQL transaction is committed, the agent’s locally persisted replay offset is advanced to the next CDC offset, and then the Kafka consumer position is committed. In practice, the agent resumes from that locally persisted replay state on restart, so the saved SQL replay position, not just Kafka’s consumer-group state, is the authoritative resume point.

Shutdown

When SQLAgent is run as a Windows Service, normal shutdown should be initiated through Service Control Manager. When it is run under Linux service management such as systemd, it should be stopped through the service manager, for example, with systemctl stop. The same controlled shutdown path is used when an administrator selects Stop in the management UI or sends POST /api/stop to the management API. If a developer starts the agent manually in a console, pressing Ctrl+C triggers the same shutdown flow.

During shutdown, the host is told to stop and the replay loop exits cleanly. SQLAgent then changes its visible state to Stopping, stops its batch timeout timer, commits any uncommitted SQL batch, closes the SQL connection, closes the CDC consumer, and deletes its PID file. That means a graceful shutdown does perform a final database commit of in-flight batched work before the process exits. On the next startup, the agent resumes from the last replay position that was committed into persistent state. If the process is not shut down cleanly, any work that had not yet been committed as a batch can be replayed after restart.

Management UI

SQLAgent includes a secure browser-based management UI for day-to-day operations. The main status page is the operator dashboard. It shows the current agent status, environment, operation counts, error counts, last activity time, current SQL batch depth, commit counters, database connectivity, Kafka topics and current CDC offset, and information about the snapshot the replica was last seeded from. That page also provides Pause, Resume, and Stop controls.

The log page shows recent log output and live log updates through the authenticated SignalR log stream, with controls to pause the live view, choose how many entries to keep visible, toggle color-coded severity, clear the browser view, and download the visible log text. The activity page shows recent CDC traffic from Kafka; supports refresh, search, ordering, and download; and allows record payloads to be opened for inspection. When both current and original payloads are present, the UI can highlight byte-level differences between them.

The UI also exposes several SQLAgent-specific operational pages. The database pages show which configured tables exist, along with row, column, and index counts, and provide management actions such as create, drop, truncate, add indexes, drop indexes, export to CSV, export to sequential format, and restore missing tables. Those database-management actions are only available while the agent is paused. The snapshots pages list known snapshots, show storage sizes, allow a new snapshot to be requested, and show snapshot metadata and file-level detail. Requesting a new snapshot only asks the upstream snapshot process to create and publish one; it does not by itself cause SQLAgent to apply that snapshot. Administrative pages allow the admin password to be changed, logging levels to be adjusted by namespace, the encryption key ring to be exported for secure backup, and, when enabled by configuration, an authenticated host terminal to be opened in the browser. A table-schema page exists in the UI, but it is currently only a placeholder.

Most of the functionality exposed in the management UI is implemented by authenticated calls to the agent’s API. That means the built-in UI is only one client of the management surface, and the same protected interfaces can be used to build custom dashboards, operational tooling, and monitoring systems.

Management API

The application exposes a management API with endpoints that can be categorized in several areas:

  • Authentication
  • Agent control
  • Agent status
  • Log access
  • Kafka inspection
  • Database administration
  • Snapshot inspection
  • Snapshot requests
  • Encryption-key export

Security

The API is protected in layers. It runs on the agent’s TLS-enabled management endpoint, and the agent will not start without the TLS certificate and password it needs for that endpoint. The admin password is stored encrypted in the local state database using ASP.NET Core Data Protection; on Windows, the key ring can also be protected to the local machine with DPAPI. A successful login returns a short-lived JWT whose lifetime is controlled by the SQLAgent configuration, and the same login also sets a Secure, HttpOnly, SameSite=Strict browser cookie. Protected routes accept either the bearer token or that secure cookie. Browser requests for protected HTML are redirected to the sign-in page, while API clients receive a 401 Unauthorized error. The authenticated SignalR log stream and the optional /ssh host-terminal WebSocket are protected by the same authorization layer. The host terminal adds a second layer of protection because, after the admin session is established, it still requires valid local host SSH credentials before opening a shell on localhost. LaunchPad integration is limited to localhost and uses a short-lived, one-time login ticket. If Swagger UI is enabled, it can be used to explore the API, but protected operations still require a valid admin token.

One important operational rule is that database-altering API functions are designed to be used only while the agent is paused. The replay service enforces that requirement before it allows table creation, deletion, truncation, index management, export, or missing-table restore to proceed.

Configuration

SQL Agent supports both ConfigServer and local-file bootstrap. It looks for a ConfigServer URL and profile from command-line options, then environment variables, and finally local app settings. If ConfigServer is configured, it registers if needed, authenticates, downloads MasterConfig.json and ServerConfig.SQLAgent.json, and requests a TLS certificate if one is not already present locally. Successful downloads are cached in the local profile directory so the agent can start later from a last-known-good local copy if ConfigServer is unavailable.

If ConfigServer is not being used, SQLAgent loads those same configuration files directly from the local profile directory. On Windows that directory is %ProgramData%\Synergex\SIP\<profile>, and on Linux it's /etc/synergex/sip/<profile>. Those files can be cached copies from a previous ConfigServer startup or manually maintained local files.

If ConfigServer is configured but unavailable, the agent falls back to those local copies rather than failing immediately. Startup only stops when the required local configuration files or TLS materials are missing or unusable.

Periodic Services

SQL Agent supports Periodic Services which allows you to configure one or more recurring background tasks that run on a cron-style schedule alongside normal agent processing.