An introduction to PostgreSQL

DIY Data Types

When none of the built-in data types is enough, you can easily build your own by defining a so-called composite data type. For example, you can define a new type that describes an address:

CREATE TYPE address_type AS (
  street VARCHAR(255),
  city VARCHAR(100),
  state CHAR(2),
  zip_code VARCHAR(10)
);

You can later use it in a table:

CREATE TABLE library_branch (
  branch_id SERIAL PRIMARY KEY,
  branch_name VARCHAR(255) NOT NULL,
  location address_type
);

Inserting is equally simple:

INSERT INTO library_branch (branch_name, location)
VALUES ('Downtown Branch', ROW('123 Main St', 'Anytown', 'NY', '12345'));

When querying data, you use dot notation, wrapping the composite column in parentheses so that PostgreSQL does not mistake it for a table name:

SELECT branch_name, (location).city AS city
FROM library_branch;

Internally, PostgreSQL treats these types as if they were rows in a table, with the fields of the composite type corresponding to columns. Composite types can be used as column data types in tables, as return types for functions, or as arguments for functions, which provides significant flexibility for passing complex structured data. While you cannot directly create indexes or enforce constraints on individual fields within a composite-type column, you can achieve similar functionality with expression indexes or with constraints defined on the table that uses the composite type.
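For instance, here is a minimal sketch of both techniques (the index and constraint names are illustrative):

CREATE INDEX idx_location_city
  ON library_branch (((location).city));

ALTER TABLE library_branch
  ADD CONSTRAINT state_format
  CHECK ((location).state ~ '^[A-Z]{2}$');

The extra parentheses around (location).city are needed twice: once because an expression index requires its expression in parentheses, and once so PostgreSQL reads location as a column rather than a table.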

Unfortunately, composite types can introduce overhead compared to simple types, especially if you frequently need to access or manipulate individual fields. And while composite types can make your schema more organized and semantically clear, they add complexity, particularly when it comes to writing queries and managing the database schema.

JSONB in a Sample App

Leaving academic discussion aside, I'll move on to a practical example. As a web app developer, you may want to know whether you can use PostgreSQL to provide persistence for a web app written in JavaScript. You may also wonder how difficult it is to manage the database connection and how to deploy everything. Finally, you may want to know whether you can store data in JSON and query it natively. I'll provide a simple example that answers some of these questions.

Traditionally, the equivalent of a "Hello, world!" application in front-end development is a simple task management app. Because this is an article on PostgreSQL, I'll concentrate on how JSON is sent by the back end.

Listing 1 contains a barebones Flask app with SQLAlchemy database management, whereas Listings 2 and 3 include the related HTML and vanilla JavaScript files (remember to put these files in the static subfolder, which is where the Flask app serves them from). The HTML and JavaScript files communicate with the Flask back end, which listens on localhost on port 5000, using the /tasks endpoint.

Listing 1

A Sample Flask App

01 from flask import Flask, request, jsonify, send_from_directory
02 from flask_sqlalchemy import SQLAlchemy
03 from flask_cors import CORS
04 from sqlalchemy.dialects.postgresql import JSONB
05
06 app = Flask(__name__, static_url_path='', static_folder='static')
07 CORS(app)  # Initialize CORS with default parameters
08
09 app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://taskuser:taskpassword@postgres/taskdb'
10 app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
11 db = SQLAlchemy(app)
12
13 class Task(db.Model):
14   id = db.Column(db.Integer, primary_key=True)
15   name = db.Column(JSONB, nullable=False)  # the task name is stored as JSONB
16   parent_id = db.Column(db.Integer, db.ForeignKey('task.id'), nullable=True)
17   subtasks = db.relationship('Task', backref=db.backref('parent', remote_side=[id]), lazy=True)
18
19 with app.app_context():
20   db.create_all()
21
22 @app.route('/tasks', methods=['POST'])
23 def create_task():
24   data = request.get_json()
25   task = Task(name=data['name'], parent_id=data.get('parent_id'))
26   db.session.add(task)
27   db.session.commit()
28   return jsonify({'id': task.id, 'name': task.name, 'parent_id': task.parent_id}), 201
29
30 @app.route('/tasks', methods=['GET'])
31 def get_tasks():
32   tasks = Task.query.all()
33   tasks_list = []
34   for task in tasks:
35     tasks_list.append({
36       'id': task.id,
37       'name': task.name,
38       'parent_id': task.parent_id
39     })
40   return jsonify(tasks_list)
41
42 @app.route('/tasks/<int:task_id>', methods=['GET'])
43 def get_task_with_subtasks(task_id):
44   task = Task.query.get(task_id)
45   if not task:
46     return jsonify({'message': 'Task not found'}), 404
47
48   def serialize_task(task):
49     return {
50       'id': task.id,
51       'name': task.name,
52       'subtasks': [serialize_task(subtask) for subtask in task.subtasks]
53     }
54
55   return jsonify(serialize_task(task))
56
57 @app.route('/')
58 def index():
59   return send_from_directory('static', 'index.html')
60
61 if __name__ == '__main__':
62   app.run(debug=True)

Listing 2

static/index.html

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Task Manager</title>
</head>
<body>
  <h1>Task Manager</h1>
  <div>
    <input type="text" id="taskName" placeholder="Task Name">
    <input type="number" id="parentTaskId" placeholder="Parent Task ID (Optional)">
    <button onclick="createTask()">Create Task</button>
  </div>
  <div id="tasks"></div>
  <script src="script.js"></script>
</body>
</html>

Listing 3

static/script.js

01 function createTask() {
02   const name = { text: document.getElementById('taskName').value };
03   const parent_id = document.getElementById('parentTaskId').value;
04
05   fetch('/tasks', {
06     method: 'POST',
07     headers: {
08       'Content-Type': 'application/json',
09     },
10     body: JSON.stringify({ name, parent_id: parent_id || null }),
11   })
12   .then(response => response.json())
13   .then(data => {
14     alert('Task Created: ' + data.name.text);
15     loadTasks();
16   })
17   .catch((error) => {
18     console.error('Error:', error);
19   });
20 }
21
22 function loadTasks() {
23   fetch('/tasks')
24   .then(response => response.json())
25   .then(tasks => {
26     const tasksElement = document.getElementById('tasks');
27     tasksElement.innerHTML = '';
28     tasks.forEach(task => {
29       const taskElement = document.createElement('div');
30       taskElement.textContent = `ID: ${task.id}, Name: ${task.name.text}, Parent ID: ${task.parent_id || 'None'}`;
31       tasksElement.appendChild(taskElement);
32     });
33   });
34 }
35
36 document.addEventListener('DOMContentLoaded', function() {
37   loadTasks();
38 });

In the Flask app in Listing 1, the functions used to create and get tasks return JSON responses. When looking at the code, you may wonder where exactly the tasks are saved in the PostgreSQL database. In lines 23-27, you can see that the create_task() function calls db.session.add() and db.session.commit(). When the database transaction commits successfully, the task is permanently stored in the PostgreSQL database defined in the SQLAlchemy configuration. Now, when you point your browser at http://localhost:5000, you should see the Flask app's main screen, where you can add tasks and subtasks (Figure 1) and see the corresponding JSON structure (Figure 2).

Figure 1: The Flask app task manager.
Figure 2: The corresponding JSON structure for the Flask app.

To run the example Flask app, you will need a Dockerfile (Listing 4), a requirements.txt file listing the Python modules used (Listing 5), and, finally, the docker-compose.yml file with the instructions on how to run the Flask app and PostgreSQL (Listing 6). Listing 6 defines two Docker services (postgres and flaskapp) and one Docker volume (postgres_data) mounted at /var/lib/postgresql/data. Note that I have defined the connection parameters, including the password, directly in the files; in a real use case they should be stored in an appropriate secrets store such as Vault.

Listing 4

A Sample Dockerfile for the Flask App

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["flask", "run", "--host=0.0.0.0"]

Listing 5

requirements.txt

flask
flask_sqlalchemy
flask-cors
psycopg2-binary

Listing 6

docker-compose.yml

01 services:
02   postgres:
03     image: postgres:latest
04     environment:
05       POSTGRES_USER: taskuser
06       POSTGRES_PASSWORD: taskpassword
07       POSTGRES_DB: taskdb
08     ports:
09       - "5432:5432"
10     volumes:
11       - postgres_data:/var/lib/postgresql/data
12
13   flaskapp:
14     build: .
15     ports:
16       - "5000:5000"
17     volumes:
18       - .:/app
19     environment:
20       FLASK_APP: app.py
21       FLASK_RUN_HOST: 0.0.0.0
22     depends_on:
23       - postgres
24     command: ["flask", "run", "--host=0.0.0.0"]
25
26 volumes:
27   postgres_data:
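With all the files in place, you can build and start the stack and then exercise the /tasks endpoint directly from the command line. The following smoke test is just a sketch; the task text is arbitrary, and the JSON shape matches what script.js sends:

# Build the images and start both services in the background
docker compose up -d --build

# Create a task
curl -X POST http://localhost:5000/tasks \
  -H "Content-Type: application/json" \
  -d '{"name": {"text": "Write article"}, "parent_id": null}'

# List all tasks
curl http://localhost:5000/tasks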

This example is interesting because the task data, although simple, is stored not as varchar but as JSONB. You can quickly verify this by connecting to the local PostgreSQL instance using the psql tool and the URI defined in the Flask app:

psql "postgresql://taskuser:taskpassword@localhost/taskdb"

(Note: Typing the password on the command line as shown above is insecure, because it can be revealed in your shell history.) After connecting to the database, you can describe the task table (\d task) and query it. Notice that the internal representation is JSONB, while the data displayed by the SELECT statement is rendered as JSON (Figure 3).

Figure 3: The table's internal structure is in JSONB, but the data displayed by SELECT is in JSON.
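Because name is a JSONB column, you can also query inside it natively from psql. A minimal sketch, assuming tasks created by the sample front end (which stores objects of the form {"text": ...}):

-- Extract the text field of every task as plain text
SELECT id, name->>'text' AS task_text FROM task;

-- Containment: find tasks matching a key/value pair
SELECT id FROM task WHERE name @> '{"text": "Write article"}';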

As you can see, creating a simple Flask app that uses PostgreSQL as an RDBMS with JSON/JSONB as the data type is quite easy. Even though many of the usual chores related to handling database connections are outsourced to SQLAlchemy to keep the code cleaner and shorter, I could have done them manually with relatively few changes. Although this example is by necessity oversimplified, it is a fully working web app, with data passed between the front end and the database through the Flask back end.

What About Kubernetes?

As PostgreSQL has matured over the years, support for running it on a Kubernetes cluster has also improved. Generally speaking, when using Kubernetes, you have two distinct options as far as persistent data is concerned: You can run databases either inside the cluster or outside of it. Both have different trade-offs, with running outside the cluster being the traditional and still more common option. However, in recent years, more and more companies have begun using operators to run their databases inside the cluster and are very happy with the results. In fact, some users think running PostgreSQL in Kubernetes is now better than running PostgreSQL on virtual machines, thanks to the hard work of many people and projects such as CloudNativePG.

CloudNativePG is a Kubernetes operator that uses a primary/standby architecture with streaming replication. When the primary fails, the operator promotes a new primary from among the read replicas. It also adjusts the number of replicas, together with all related resources such as persistent volumes (PVs) and persistent volume claims (PVCs), scaling them up and down depending on your requirements. In addition, it performs several other tasks to make sure that applications connecting to the cluster see one coherent and performant database, even though its content is distributed over several machines. For an excellent case analysis of this type of setup, see Chris Milsted and Gabriele Bartolini's presentation on running PostgreSQL in a Kubernetes cluster [12].
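To give you a taste of what this looks like in practice, the following is a minimal CloudNativePG Cluster manifest. The cluster name and storage size are illustrative; consult the operator's documentation for the full set of options:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: taskdb-cluster
spec:
  instances: 3    # one primary plus two streaming replicas
  storage:
    size: 1Gi     # each instance gets its own PV and PVC

Applying this manifest with kubectl apply creates three PostgreSQL instances, and the operator takes care of replication, failover, and the associated PVs and PVCs.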
