Relational databases

  • Tackling performance problems with more hardware usually just delays problems

  • Instead of just provisioning bigger database instance, one should try to optimize database schema and queries

    • normalizing data is usually the first thing to do and check

      • normal forms

    • in schema design, one should avoid data duplication

  • Primary keys can be also short strings instead of integers

    • with this, some information can be retained

  • PostgreSQL has auto compress feature (TOAST)

Using EXPLAIN

  • shows the query plan

  • using EXPLAIN ANALYZE actually executes the statement and run time statistics are added to display

Optimizing performance

  • Sequential scans are usually bad for performance

Indices

  • To simplify, indices in tables turn sequential scans to index scans

  • Most of the regularly used queries should have index

    • however the come with a cost

      • especially when writing to a table

  • What if I have sequential scans even after adding an index

    • small table (not lots of rows)

    • on multi field indices, fields can be omitted only from the end

    • using function in where clause

      • however function can be used in index itself

  • Postgres specific reasons why indices won't work

    • VACUUM hasn't been run, so table statistics are not updated

Object-relational mappings (ORM)

  • Usually it's a good thing to write simple code with abstractions

    • however, abstractions might have a performance cost

  • Using ORM is usually recommend, but when performance is critical it could be better to write pure SQL

Sharing responsibilities between application code and database

  • Let the database do the heavy lifting

    • when database works, app sleeps

    • the is overhead going back and forth between application and database

  • Prefer small number of big queries instead of big number of small queries

Resources

References

  • My own notes and interpretations on my colleagues excellent presentation with title "Your Customer Might Not Need a Bigger Database Instances" (17.01.2025)

Last updated