By definition, living in an information age means we have an awful lot of data. And in order for us to use that data, it needs to be organised. Enter the database. Database management systems (DBMS) are used in virtually every business but they can vary from super simple and generic to highly complex and bespoke.
When it comes to choosing a database system, you’ll need to start by asking yourself lots of questions to help you understand the features that will be needed from the DBMS:
- What will you be storing? Large documents? Lots of related records? Transactional data? Strictly or loosely structured? There’s a big difference between wanting to store a bunch of PDF files that are generated by a survey compared to developing an e-commerce site and needing to store multiple orders, complex stock data and so on.
- How hard will the system need to work? Think about whether you have any hard performance challenges such as high volumes of data moving through the system, complicated queries or large numbers of people using the system at the same time. A business with a fleet of self-driving cars that send information to a server every 20 seconds will necessarily need a very robust DBMS. Ditto an e-commerce website used by millions every day.
- Are you using anything already? If you’ve already got a system you could just re-implement then this could be more cost effective than looking for a new one. Think about what expertise you have in-house too - you may not want to outsource your software development and maintenance, so if you have people who are familiar with one system then it could be best to stick to that.
- How do you expect to consume the data? First think about the kind of software that will be talking to the database. Is it a web application? Do you need reporting capabilities? Also think about whether you might expect it to deal with any sophisticated queries. If you want to do full geospatial searching (e.g. ‘show me all of the shop locations within 20 miles of Swindon’) you’re going to need something very different to what you’d need if you were just asking something like ‘Pull up the file of the user with ID 278’.
Now you’ve got a clear idea of what will be required from your database system, you can start to think about the type of database that will suit you best. In general you’ll need to choose between either self-hosted or DBaaS and between Relational (ie SQL) or noSQL. Here’s what you need to know.
Self-hosted vs DBaaS database systems
There are many advantages of a hosted database system or Database as a Service (DBaaS):
- The database software will be pre-tuned to the hardware it is running on by the supplier.
- There is less up-front financial commitment.
- There is usually flexibility to scale up or down at short notice.
- Configuration of backups, scaling, replication, upgrades and security patching is either drastically simplified or fully automated.
- You’ll need less expertise in-house and, with more support, there’s less opportunity for making mistakes.
However there are times when it won’t be the right choice for you:
- If you already have an infrastructure that you want to make use of.
- If you need to customise your database in ways that are not supported by your DBaaS provider. While many systems will have plugins or extensions to allow you to add extra functionality, there may be some new ones that aren’t supported.
- If you have to keep your data on site for compliance.
It’s worth noting that even if you use a hosted system, you’ll still need to have someone within your team who understands how best to architect the systems and get the most out of them. What you’re paying for with DBaaS is a blank space, you still have to have the expertise to structure what you put into that database, how you design your queries and so on.
Relational (SQL) Database vs NoSQL
Relational databases have been around since the 1970s and essentially store tables of structured data that can be related to one another - things like customers, orders, stock items and so on. They have what is known as ACID properties - in other words, their strengths include Atomicity, Consistency, Isolation and Durability.
Relational databases support sophisticated queries and updates with SQL (Structured Query Language) which is the big standard for working with databases. While simple queries are mostly portable between database systems, in practice application code will often end up being at least loosely coupled to a particular SQL implementation. The exception is where you’re using intermediate software which does the actual SQL generation.
All SQLs are similar but some might have more advanced applications than others.
When it comes to choosing a relational DBMS, there are a couple of things you’ll need to consider:
- Advanced features: For example natural language searching or geospatial data. These fall outside the SQL standards and often differ significantly between systems in terms of functionality. So if there’s something specific you’re looking for, this might help you narrow down your list.
- Cost: Relational database systems range from virtually free open source products like PostgreSQL and MySQL to potentially very expensive ones such as SQL Server and Oracle. The proprietary database systems generally have different tiers of license and a cost that will scale further depending on how beefy your database servers are.
NoSQL database systems are often specialised to address a particular use case that a SQL database would struggle with, for example large unstructured documents or massive data throughput. They can also reduce complexity when most RDBMS features would be superfluous, for example in simple key-value stores.
Using a noSQL often comes at the cost of sacrificing some of the ACID guarantees we mentioned in relation to SQL systems. But there is a compromise - sometimes it’s possible to use a SQL system in a way that is more like a noSQL one, for example by storing XML or JSON documents in a table where appropriate. On the other hand, it’s virtually impossible to use a NoSQL system in a SQL-like way.
If you’re looking for a new database, you will need to spend some time thinking about your needs, the problems you’re trying to solve, the data you’ll be storing and how you’re using it. All of these factors will influence what DBMS you end up going for. Our opinion? A hosted SQL is always a good first choice unless you can definitely prove it won’t meet your needs.