Mapping Dallas Police Activity in Real-Time

The City of Dallas publishes a real-time snapshot of active police calls, meaning calls to which officers are currently assigned and responding. The data updates every two minutes and includes location information, given either as an address or as cross streets. To visualize these calls, we need to transform the location details into geographic coordinates for mapping. The immediate goals of this project are to:

  1. Capture real-time active calls from the Dallas Open Data API.
  2. Track changes between API calls, integrating new, updated, and removed records into a local dataset.
  3. Enrich call data by forward geocoding location information.
  4. Persist updates to Amazon S3 for historical reference.
  5. Publish these updates via an API endpoint using Server-Sent Events (SSE).
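
As an illustration of goal 2, change tracking can be sketched as a comparison of two snapshots keyed by a call identifier. This is a minimal sketch, not the project's actual code: the field names `incident_number` and `status` are assumptions made for the example, and a real key may need to combine several fields if one incident can appear on multiple rows.

```python
from typing import Any

Record = dict[str, Any]


def index_by_key(records: list[Record], key_field: str = "incident_number") -> dict[str, Record]:
    """Index a snapshot by a call identifier (key_field is a hypothetical name)."""
    return {str(rec[key_field]): rec for rec in records}


def diff_snapshots(
    previous: dict[str, Record], current: dict[str, Record]
) -> dict[str, list[Record]]:
    """Classify records as new, updated, or removed between two snapshots."""
    # Present now but not before: a new call.
    new = [rec for key, rec in current.items() if key not in previous]
    # Present before but not now: the call is no longer active.
    removed = [rec for key, rec in previous.items() if key not in current]
    # Present in both but with different contents: an update.
    updated = [
        rec
        for key, rec in current.items()
        if key in previous and previous[key] != rec
    ]
    return {"new": new, "updated": updated, "removed": removed}
```

Each of the three lists can then be handled separately, for example by writing new and updated records to the local dataset and flagging removed ones as closed.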

Together, these goals lay the foundation for a real-time dashboard whose core feature is a map of current active calls. Additional features could include summary statistics and a call details panel. The dashboard would stay current by receiving notifications of new, updated, and removed calls via Server-Sent Events.

The Data Pipeline:

  1. On an EventBridge schedule, a Lambda function downloads the current snapshot of active calls from Dallas Open Data and saves the raw JSON to an S3 bucket.
  2. Writing the file to the bucket emits an s3:ObjectCreated:Put event.
  3. The S3 event is published to an SNS topic.
  4. The SNS topic fans the event out to parallel processes that track call changes and forward geocode location data.
  5. Location information and active call data are written to separate DynamoDB tables.
  6. DynamoDB Streams sends updates to a Lambda function.
  7. That Lambda function persists the changes to S3 and publishes address data and call updates to an API, which streams them to clients as Server-Sent Events.
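
Steps 2 to 4 mean that a downstream Lambda function receives the S3 notification wrapped inside an SNS message. The sketch below shows one way to unwrap it and recover the bucket and object key. The function name `extract_s3_objects` is hypothetical and not taken from the project.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sns_event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs from an SNS-delivered S3 notification."""
    objects = []
    for sns_record in sns_event.get("Records", []):
        # SNS carries the original S3 notification as a JSON string.
        s3_event = json.loads(sns_record["Sns"]["Message"])
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 notifications.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects
```

A handler for either parallel process could call this first and then fetch each object from S3 before tracking changes or geocoding.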

The API prototype was built with Python's FastAPI framework. It runs in a Docker container hosted on AWS Elastic Container Service (ECS). Its endpoints allow fetching the most recent active calls with enriched location details and streaming real-time updates to the client. When a new client connects, it downloads all current calls and then subscribes to /get-events/ to receive streaming updates for calls and address information.
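
The streaming side can be sketched as a queue feeding an async generator that emits frames in the SSE wire format. This is a simplified, standard-library-only illustration rather than the prototype's code: the `format_sse` helper and the `type` key in the payload are assumptions made for the example.

```python
import asyncio
import json
from typing import AsyncIterator


def format_sse(payload: dict, event: str = "message") -> str:
    """Serialize one payload as a Server-Sent Events frame."""
    # A frame is a set of "field: value" lines terminated by a blank line.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


async def event_publisher(queue: asyncio.Queue) -> AsyncIterator[str]:
    """Yield one SSE frame for every payload placed on the queue."""
    while True:
        payload = await queue.get()
        # "type" is a hypothetical payload key used as the SSE event name.
        yield format_sse(payload, event=payload.get("type", "message"))
```

In FastAPI, a generator like this can be passed to `StreamingResponse` with `media_type="text/event-stream"`. Note that with a single shared queue each event is consumed by only one reader, so serving several clients at once would likely need one queue per connection.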

  1. Event Queue for Real-Time Updates:
    • An event queue manages incoming events.
    • The event_publisher function continuously yields events from the queue, which are streamed to connected clients via the /get-events/ endpoint using the SSE protocol.
  2. Event Publishing:
    • The /events/ POST endpoint receives JSON payloads originating from DynamoDB Streams and puts them on the event queue for processing.
    • Clients subscribe to updates by connecting to the /get-events/ endpoint, which streams events using the StreamingResponse class.
  3. Current Calls Retrieval:
    • The /current-calls/ endpoint retrieves current active calls from two DynamoDB tables:
      • dpd_active_calls for call data.
      • address_cache for associated addresses.
    • The endpoint returns the latest update for each call and any newly processed address data.
  4. DynamoDB Operations:
    • The scan() method is used to fetch all records from both tables.
    • Call records are checked for the most recent updates based on the update_date field, ensuring only the latest data is returned.
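
The "latest update per call" check described in item 4 can be sketched as a single pass over the scanned records. This is an assumption-laden example: it treats `update_date` as an ISO 8601 string and uses a hypothetical `incident_number` field as the call identifier.

```python
from datetime import datetime
from typing import Any


def latest_per_call(
    records: list[dict[str, Any]],
    key_field: str = "incident_number",  # hypothetical identifier field
    date_field: str = "update_date",
) -> list[dict[str, Any]]:
    """Keep only the most recently updated record for each call."""
    latest: dict[str, dict[str, Any]] = {}
    for rec in records:
        key = str(rec[key_field])
        seen = latest.get(key)
        # Assumes timestamps are ISO 8601 strings that fromisoformat can parse.
        if seen is None or (
            datetime.fromisoformat(rec[date_field])
            > datetime.fromisoformat(seen[date_field])
        ):
            latest[key] = rec
    return list(latest.values())
```

Because the function works on plain dictionaries, it can be applied to the items returned by the table scan without any further conversion.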

Next steps will be to:

  • Refine the forward geocoding process. It currently uses the Radar API to retrieve latitude and longitude, but roughly 30% of address information is obviously incorrect. I may need to query multiple geocoding APIs and choose the best result to use as the actual address.
  • Refactor Terraform into modules. Terraform code is currently broken into files by service and function. Refactoring into separate modules would improve readability and maintainability.
  • Continue work on the API.
    • We might be able to improve response time by using a query instead of a scan on the active calls and address tables.
    • Implement size limits on the event_queue.
    • Implement a caching mechanism to reduce DynamoDB queries.
    • Add authentication and API rate limiting to prevent malicious activity.
  • Build a front-end dashboard.
    • A user-friendly dashboard will display real-time updates with a map and a call details panel.
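
For the event_queue size limit listed above, one possible approach is a bounded `asyncio.Queue` that discards the oldest event when full, so that slow consumers cannot make memory grow without limit. This is only a sketch of one policy: the helper name `put_drop_oldest` is hypothetical, and rejecting new events instead of dropping old ones would be an equally valid choice.

```python
import asyncio


def put_drop_oldest(queue: asyncio.Queue, item: object) -> bool:
    """Enqueue item; if the queue is full, discard the oldest entry first.

    Returns True when an older item had to be dropped.
    """
    dropped = False
    if queue.full():
        try:
            queue.get_nowait()  # remove the oldest queued event
            dropped = True
        except asyncio.QueueEmpty:
            pass
    queue.put_nowait(item)
    return dropped
```

The queue would be created with a limit, for example `asyncio.Queue(maxsize=1000)`, where the value 1000 is an arbitrary placeholder to be tuned. Dropping old events favors freshness, which seems reasonable here because clients can always reload the full state from /current-calls/.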