Schema#
After SUMO produces XML output, AgentSUMO’s xml_to_sqlite_tool parses it
into a relational SQLite database (simulations.db) — the structured
representation of simulation results. This page documents that schema so
custom SQL workflows can sit alongside the agent’s natural-language access.
The schema is defined in
agentsumo_mcp/analysis/xml_to_sqlite.py
(_create_schema). All six tables are keyed by a simulation_id so
multiple runs coexist in the same file and can be compared through
ordinary SQL JOIN.
SQLite integration pipeline of AgentSUMO. The xml_to_sqlite_tool
converts SUMO XML outputs into a structured database of six linked
tables keyed by simulation_id. The SQLite MCP Server then exposes
query, schema inspection, and analysis tools, enabling the Planner
Agent to answer natural-language questions about simulation results
through SQL.#
Entity-relationship diagram#
erDiagram
simulations ||--o{ trips : "simulation_id"
simulations ||--o{ vehicle_info : "simulation_id"
simulations ||--o{ edge_info : "simulation_id"
simulations ||--o{ edge_metrics : "simulation_id"
simulations ||--o{ network_state : "simulation_id"
vehicle_info }o--|| trips : "vehicle_id = trip_id"
edge_info ||--o{ edge_metrics : "edge_id"
simulations {
TEXT simulation_id PK
TEXT created_at
INTEGER vehicle_count
TEXT net_file
TEXT route_file
TEXT description
}
trips {
TEXT simulation_id PK
TEXT trip_id PK
REAL duration
REAL waitingTime
REAL timeLoss
REAL CO2_abs
REAL fuel_abs
}
vehicle_info {
TEXT simulation_id PK
TEXT vehicle_id PK
TEXT vehicle_type
TEXT fuel_type
TEXT origin_road
TEXT destination_road
}
edge_info {
TEXT simulation_id PK
TEXT edge_id PK
TEXT road_name
REAL length
INTEGER num_lanes
REAL speed_limit
}
edge_metrics {
TEXT simulation_id PK
TEXT edge_id PK
REAL interval_begin PK
REAL density
REAL speed
REAL waitingTime
REAL CO2_abs
}
network_state {
TEXT simulation_id PK
REAL time PK
INTEGER running
INTEGER halting
REAL meanSpeed
REAL meanTravelTime
REAL meanWaitingTime
}
Tables at a glance#
Table |
Granularity |
Source |
Rows per simulation |
|---|---|---|---|
|
Run |
|
1 |
|
Vehicle trip |
|
~ vehicle count |
|
Vehicle |
|
~ vehicle count |
|
Road segment |
SUMO |
~ edge count |
|
Edge × time interval |
|
~ edges × intervals |
|
Simulation time step |
|
~ duration / step |
Tables#
simulations#
Run-level metadata. One row per xml_to_sqlite_tool invocation.
Column |
Type |
Description |
|---|---|---|
|
TEXT (PK) |
Caller-provided unique identifier (e.g., |
|
TEXT |
ISO-8601 timestamp at ingest. |
|
INTEGER |
Total trips ingested from |
|
TEXT |
Path to the SUMO network used for this run. |
|
TEXT |
Path to the route file used for this run. |
|
TEXT |
Free-text label provided by the caller. |
Primary key: simulation_id.
trips#
One row per vehicle trip, populated from tripinfo.xml. Captures both trip
dynamics and per-vehicle emission totals (HBEFA-based).
Column |
Type |
Description |
|---|---|---|
|
TEXT |
FK → |
|
TEXT |
SUMO vehicle ID. |
|
REAL |
Departure time in seconds. |
|
TEXT |
Lane the vehicle departed from. |
|
REAL |
Lateral departure position. |
|
REAL |
Departure speed (m/s). |
|
REAL |
Delay vs. scheduled departure (s). |
|
REAL |
Arrival time in seconds. |
|
TEXT |
Lane on arrival. |
|
REAL |
Lateral arrival position. |
|
REAL |
Arrival speed (m/s). |
|
REAL |
Total trip duration (s). |
|
REAL |
Distance traveled (m). |
|
REAL |
Cumulative time spent waiting (s). |
|
INTEGER |
Number of waiting events. |
|
REAL |
Cumulative time at scheduled stops (s). |
|
REAL |
Time lost vs. unimpeded travel (s). |
|
INTEGER |
Number of reroutes during the trip. |
|
TEXT |
Comma-separated SUMO device list. |
|
TEXT |
Vehicle type ID. |
|
REAL |
Speed factor applied to this vehicle. |
|
TEXT |
Set if the vehicle vaporized (failed to insert / removed). |
|
REAL |
Total CO emissions (mg). |
|
REAL |
Total CO₂ emissions (mg). |
|
REAL |
Total hydrocarbons (mg). |
|
REAL |
Total particulate matter (mg). |
|
REAL |
Total NOₓ (mg). |
|
REAL |
Total fuel consumed (mL). |
|
REAL |
Total electricity consumed (Wh). |
Primary key: (simulation_id, trip_id).
vehicle_info#
Per-vehicle metadata derived from the route file and vType definitions. Lets queries group results by fuel type or filter by origin/destination road without re-parsing the route XML.
Column |
Type |
Description |
|---|---|---|
|
TEXT |
FK → |
|
TEXT |
Joins to |
|
TEXT |
vType ID (e.g., |
|
TEXT |
Derived classification: |
|
TEXT |
First edge in the vehicle’s route. |
|
TEXT |
Last edge in the vehicle’s route. |
|
TEXT |
Human-readable name of |
|
TEXT |
Human-readable name of |
Primary key: (simulation_id, vehicle_id).
edge_info#
Per-edge static attributes pulled from the SUMO network via sumolib. The
table holds the structural information about each edge — the dynamic
metrics live in edge_metrics.
Column |
Type |
Description |
|---|---|---|
|
TEXT |
FK → |
|
TEXT |
SUMO edge ID. |
|
TEXT |
Human-readable name (e.g., |
|
REAL |
Edge length in meters. |
|
INTEGER |
Number of lanes on this edge. |
|
REAL |
Posted speed limit (m/s). |
Primary key: (simulation_id, edge_id).
edge_metrics#
Per-edge per-interval traffic and emission metrics, populated from
edgeData.xml (traffic) and edgeData_emission.xml (emissions). The
emission columns come in three normalizations:
*_abs— total over the interval.*_normed— length-and-time normalized.*_perVeh— per-vehicle on the edge during the interval.
Column |
Type |
Description |
|---|---|---|
|
TEXT |
FK → |
|
TEXT |
Joins to |
|
REAL |
Interval start (s). |
|
REAL |
Interval end (s). |
|
REAL |
Total vehicle-seconds sampled. |
|
REAL |
Average travel time on the edge. |
|
REAL |
Travel time including overlap. |
|
REAL |
Vehicles per km. |
|
REAL |
Vehicles per km per lane. |
|
REAL |
Fraction of the edge occupied. |
|
REAL |
Total waiting time on the edge. |
|
REAL |
Total time loss on the edge. |
|
REAL |
Average speed (m/s). |
|
REAL |
|
|
INTEGER |
Vehicles that departed from this edge. |
|
INTEGER |
Vehicles that arrived at this edge. |
|
INTEGER |
Vehicles that entered the edge. |
|
INTEGER |
Vehicles that left the edge. |
|
INTEGER |
Lane-change exits. |
|
INTEGER |
Lane-change entries. |
|
REAL |
Absolute emissions over the interval (7 species). |
|
REAL |
Normalized emissions (7 species). |
|
REAL |
Per-vehicle emissions (7 species). |
Primary key: (simulation_id, edge_id, interval_begin).
network_state#
Network-wide per-step state extracted from summary.xml. Records cumulative
vehicle counts (loaded, inserted, running, waiting, ended, arrived, halting)
together with mean speed, travel time, and waiting time across all active
vehicles at each simulation step. Supports temporal analyses such as
congestion-onset detection and peak-halting comparison that vehicle-level
aggregates cannot resolve.
Column |
Type |
Description |
|---|---|---|
|
TEXT |
FK → |
|
REAL |
Simulation time step (s). |
|
INTEGER |
Cumulative vehicles parsed from the route file. |
|
INTEGER |
Cumulative vehicles that entered the network. |
|
INTEGER |
Vehicles currently in the network at time t. |
|
INTEGER |
Vehicles waiting to be inserted at time t. |
|
INTEGER |
Cumulative vehicles that left the network (arrived + vaporized). |
|
INTEGER |
Cumulative vehicles that reached their destination. |
|
INTEGER |
Vehicles currently stopped (speed = 0). |
|
REAL |
Mean speed of running vehicles (m/s). |
|
REAL |
Mean of |
|
REAL |
Mean travel time of running vehicles (s). |
|
REAL |
Mean waiting time of running vehicles (s). |
|
INTEGER |
Cumulative SUMO teleport events. |
|
INTEGER |
Cumulative collision events. |
Primary key: (simulation_id, time).
Indexes#
The schema creates indexes on the most common access patterns:
(simulation_id, edge_id) on edge_metrics; road_name and length on
edge_info; fuel_type, origin_road, destination_road on
vehicle_info; and simulation_id on network_state. These keep
cross-scenario joins responsive even with tens of millions of rows.
Example queries#
Canonical SQL snippets you can paste into any SQLite client (or ask the agent to run for you in natural language).
Top-N slowest edges in one run#
1SELECT e.road_name, em.edge_id, AVG(em.speed) AS avg_speed
2FROM edge_metrics em
3JOIN edge_info e
4 ON e.simulation_id = em.simulation_id
5 AND e.edge_id = em.edge_id
6WHERE em.simulation_id = 'baseline'
7GROUP BY em.edge_id
8ORDER BY avg_speed ASC
9LIMIT 10;
Per-vehicle CO₂ emissions for a fleet sub-group#
SELECT v.fuel_type, SUM(t.CO2_abs) AS total_co2_mg, COUNT(*) AS trips
FROM trips t
JOIN vehicle_info v
ON v.simulation_id = t.simulation_id
AND v.vehicle_id = t.trip_id
WHERE t.simulation_id = 'ev_75pct'
GROUP BY v.fuel_type
ORDER BY total_co2_mg DESC;
Departure-delay distribution#
SELECT
CASE
WHEN departDelay < 0 THEN 'early'
WHEN departDelay = 0 THEN 'on time'
WHEN departDelay < 60 THEN 'under 1 min'
WHEN departDelay < 300 THEN '1–5 min'
ELSE '5+ min'
END AS bucket,
COUNT(*) AS trips
FROM trips
WHERE simulation_id = 'baseline'
GROUP BY bucket
ORDER BY trips DESC;
Origin-destination pair counts#
SELECT origin_road, destination_road, COUNT(*) AS trips
FROM vehicle_info
WHERE simulation_id = 'baseline'
GROUP BY origin_road, destination_road
ORDER BY trips DESC
LIMIT 20;
Side-by-side comparison of two simulations#
SELECT
AVG(CASE WHEN simulation_id = 'baseline' THEN duration END) AS avg_dur_baseline_s,
AVG(CASE WHEN simulation_id = 'teheran_lane_reduction' THEN duration END) AS avg_dur_policy_s,
AVG(CASE WHEN simulation_id = 'baseline' THEN waitingTime END) AS avg_wait_baseline_s,
AVG(CASE WHEN simulation_id = 'teheran_lane_reduction' THEN waitingTime END) AS avg_wait_policy_s,
SUM(CASE WHEN simulation_id = 'baseline' THEN CO2_abs END) AS co2_baseline_mg,
SUM(CASE WHEN simulation_id = 'teheran_lane_reduction' THEN CO2_abs END) AS co2_policy_mg
FROM trips
WHERE simulation_id IN ('baseline', 'teheran_lane_reduction');
Time loss aggregated by road name#
SELECT e.road_name, SUM(em.timeLoss) AS total_loss_s
FROM edge_metrics em
JOIN edge_info e
ON e.simulation_id = em.simulation_id
AND e.edge_id = em.edge_id
WHERE em.simulation_id = 'baseline'
GROUP BY e.road_name
HAVING total_loss_s > 0
ORDER BY total_loss_s DESC
LIMIT 15;
Top emitters by fuel type#
SELECT v.fuel_type,
SUM(t.NOx_abs) AS nox_mg,
SUM(t.PMx_abs) AS pmx_mg,
COUNT(*) AS trips
FROM trips t
JOIN vehicle_info v
ON v.simulation_id = t.simulation_id
AND v.vehicle_id = t.trip_id
WHERE t.simulation_id = 'baseline'
GROUP BY v.fuel_type
ORDER BY nox_mg DESC;
Top 5% busiest edges (length-weighted)#
1WITH per_edge AS (
2 SELECT em.edge_id,
3 e.road_name,
4 e.length,
5 AVG(em.density) AS mean_density
6 FROM edge_metrics em
7 JOIN edge_info e
8 ON e.simulation_id = em.simulation_id
9 AND e.edge_id = em.edge_id
10 WHERE em.simulation_id = 'baseline'
11 GROUP BY em.edge_id
12)
13SELECT *
14FROM per_edge
15ORDER BY mean_density DESC
16LIMIT (SELECT COUNT(*) / 20 FROM per_edge);
See also
Result Analysis Tools —
xml_to_sqlite_toolpopulates this schema;simulation_report_toolrenders an HTML report from it.SQLite MCP Server — how the Planner Agent queries this database in natural language.