The CourtListener Database Replication service is for researchers and organizations that need high-speed, granular access to the CourtListener database on an ongoing basis.
This service works by using PostgreSQL to create and maintain a table-based logical replica of our database allocated specifically for you. Once the replica has completed its first sync, you can query your replica of the CourtListener database using standard SQL commands. This provides an incredibly powerful way to access one of the largest open collections of American legal data.
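As a minimal sketch of what querying the replica could look like from Python: the helper below accepts any DB-API 2.0 connection, such as one opened with a PostgreSQL driver against your replica. The function name `count_rows` and any table name passed to it are hypothetical and do not describe the actual CourtListener schema.

```python
import re

# Conservative pattern for SQL identifiers. Table names cannot be passed
# as query parameters, so they are validated before being interpolated.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def count_rows(conn, table):
    """Return the number of rows in `table` via a DB-API 2.0 connection.

    `conn` is assumed to be an open connection to the replica; `table`
    is a hypothetical table name supplied by the caller.
    """
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unsafe table name: {table!r}")
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        return cur.fetchone()[0]
    finally:
        cur.close()
```

Because the helper relies only on the DB-API interface, it can be tried against any compliant driver before pointing it at the replica.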
The following types of data are currently available as part of this service:
The FJC Integrated Database civil data
User data will never be shared as part of this service.
If you are interested in this service, please get in touch to learn more. We will be happy to discuss the technical details.
Logical replication is a system supported by recent versions of PostgreSQL. Unlike older methods of database replication, which worked by shipping the database binary files across the network (so-called physical replication), logical replication works by streaming row-level changes (the results of inserts, updates, and deletes) from a "publisher" server to a "subscriber" server.
This form of replication provides a number of benefits over physical replication, including the ability to run slightly different versions of the database software on each side and the ability to replicate only certain tables from the publisher to the subscriber.
When we engage with you to set up logical replication, we will set up a new server for you in our cloud, and we will grant access to it from a specific IP address in your network.
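To make the publisher and subscriber roles concrete, here is a hedged sketch that builds the two PostgreSQL statements involved in logical replication, `CREATE PUBLICATION` and `CREATE SUBSCRIPTION`. It only illustrates the mechanism and is not the setup used for this service. The helper names are hypothetical, as are any publication, subscription, table, and connection values passed in.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote a string literal, doubling any embedded single quotes.

    Assumes standard_conforming_strings is on (the PostgreSQL default),
    so backslashes need no special handling.
    """
    return "'" + value.replace("'", "''") + "'"


def create_publication_sql(publication, tables):
    """Build the publisher-side statement that publishes only `tables`."""
    if not tables:
        raise ValueError("at least one table is required")
    table_list = ", ".join(quote_ident(t) for t in tables)
    return f"CREATE PUBLICATION {quote_ident(publication)} FOR TABLE {table_list};"


def create_subscription_sql(subscription, conninfo, publication):
    """Build the subscriber-side statement that connects to the publisher."""
    return (
        f"CREATE SUBSCRIPTION {quote_ident(subscription)} "
        f"CONNECTION {quote_literal(conninfo)} "
        f"PUBLICATION {quote_ident(publication)};"
    )
```

Listing tables explicitly in the publication is what allows only part of a database to be replicated.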
Once it's set up, your server will always be in sync and we can work together to scale the server however you need.
Logical replication allows for some schema differences between the publisher and the subscriber, but in general, the less you change your schema, the better. Ideally, you should not change your schema at all.
On our end, when we make schema changes, we will give you advance notice and we will complete the work according to a pre-determined maintenance schedule.
If your server goes down, ours will keep track of changes until yours comes back online. As soon as your server is back and the subscription is re-established, all changes from your downtime will be synced to your server.
If your server will be offline for an extended period of time, we need to know so that we can decide together whether to keep a log of all the changes that occur during the downtime.
Keeping these changes takes up space on our hard drive. If your server is down for too long, eventually we have to stop saving the information for you and a complete re-synchronization will be needed. We charge a fee for re-synchronization work.
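To illustrate how the size of that backlog can be estimated: PostgreSQL identifies positions in its write-ahead log with log sequence numbers (LSNs) written as two hexadecimal parts, such as `16/B374D848`, and the distance between the publisher's current position and the position a subscriber has confirmed is the amount of change data still being held. The sketch below does that subtraction in Python, which PostgreSQL's built-in `pg_wal_lsn_diff` function performs on the server. The function names and sample LSN values are hypothetical.

```python
def lsn_to_int(lsn):
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def retained_bytes(current_lsn, confirmed_flush_lsn):
    """Return how many bytes of changes the subscriber has not yet confirmed."""
    return lsn_to_int(current_lsn) - lsn_to_int(confirmed_flush_lsn)
```

For example, `retained_bytes("1/10", "0/FFFFFFF0")` returns 32, meaning the subscriber is 32 bytes behind the publisher.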
Yes, we monitor replication around the clock using our own tools and those from our provider, AWS. Some of our internal monitoring is available here.
If you have more questions, please send us an email. This service is a complicated one and we look forward to working with organizations and researchers to make it a success.