Skip to content

Update to schema v31#26

Open
frisi wants to merge 6 commits into
acoustid:mainfrom
molindo:MD-131-schema-update-v31
Open

Update to schema v31#26
frisi wants to merge 6 commits into
acoustid:mainfrom
molindo:MD-131-schema-update-v31

Conversation

@frisi
Copy link
Copy Markdown

@frisi frisi commented May 21, 2026

Harald Friessnegger added 6 commits May 26, 2025 14:06
note that this reverts the adaptions done by acoustid#8

as musicbrainz does not ship the scripts with function definitions
containing a `SET schema` local adaptions would always lead to merge
problems.

if users have problems that functions do not use the correct schema for
queries and can't find functions or tables, consider to set the schema on session,
user or database level:

```
-- session
SET search_path to musicbrainz,public;

-- user level
ALTER USER musicbrainz SET search_path TO musicbrainz, public;

-- check user level settings via psql
\drds mbdata;

-- check user level via query
SELECT
    r.rolname AS role_name,
    rs.setconfig AS configured_settings
FROM
    pg_roles r
LEFT JOIN
    pg_db_role_setting rs ON r.oid = rs.setrole
WHERE
    r.rolname = 'musicbrainz';

-- db level
SELECT setting FROM pg_settings WHERE name = 'search_path';
ALTER DATABASE database_name SET search_path TO musicbrainz, public;
```
musicbrainz has fixed upgrade issues
see metabrainz/musicbrainz-server@c7b8c29
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Walkthrough

This PR updates the MusicBrainz database schema from version 29.1.0 to 31.0.0. It refactors the dbmirror2 replication system to use direct row JSON serialization instead of dynamic column lookup, adds Medium GID support, partitions artist_release and artist_release_group tables by is_track_artist, introduces editor data sanitization for privacy, and extends schema support for series relationships and release_label uniqueness constraints.

Changes

Core Database Migration and Replication Redesign

Layer / File(s) Summary
Version and documentation updates
CHANGELOG.rst, README.rst, mbslave/__init__.py
Version bumped to 31.0.0; changelog and README updated with upgrade instructions for versions 30 and 31, including search_path configuration guidance.
Replication configuration and script organization
mbslave/replication.py, mbslave/sql/CreateAllReplicationTriggers2.sql, mbslave/sql/DropAllReplicationTriggers2.sql
Replication setup now points to unified dbmirror2.sql file instead of separate files; new helper scripts aggregate replication trigger creation/deletion across modules (caa, eaa, documentation, statistics, wikidocs).
Core dbmirror2 replication mechanism redesign
mbslave/sql/dbmirror2/README, mbslave/sql/dbmirror2/dbmirror2.sql
Removes column_info materialized view and dynamic JSON construction logic; replaces with row_to_json(OLD) and row_to_json(NEW) for direct serialization; adds pending_keys, pending_ts, and pending_data table definitions with out-of-order trigger detection via ctid matching and seqid shifting.
Medium GID support with redirect table
mbslave/sql/CreateTables.sql, mbslave/sql/CreateIndexes.sql, mbslave/sql/CreatePrimaryKeys.sql, mbslave/sql/CreateFKConstraints.sql, mbslave/sql/DropTables.sql, mbslave/sql/DropIndexes.sql, mbslave/sql/DropPrimaryKeys.sql, mbslave/sql/DropFKConstraints.sql, mbslave/sql/InsertTestData.sql
Adds gid UUID column to medium table; creates medium_gid_redirect lookup table with primary key on gid and foreign key on new_id; adds unique index on medium.gid and index on medium_gid_redirect.new_id.
Artist release and release group table partitioning
mbslave/sql/CreateTables.sql, mbslave/sql/CreateIndexes.sql
Replaces sort_character columns with name columns (VARCHAR COLLATE musicbrainz); adds primary_type_child_order, primary_type, and secondary_type_child_orders fields to artist_release_group; partitions both tables by is_track_artist with nonva/va partition pairs.
PL/pgSQL trigger and function refactoring
mbslave/sql/CreateFunctions.sql, mbslave/sql/CreateMirrorOnlyFunctions.sql
Removes SET search_path = musicbrainz, public clauses from function language declarations across ~40 trigger and utility functions; updates delete_unused_url to use simpler deletion logic with foreign_key_violation exception handler; adjusts get_artist_release_rows and get_artist_release_group_rows SELECT/join structure.
Medium and release group type update triggers
mbslave/sql/CreateTriggers.sql, mbslave/sql/CreateMirrorOnlyTriggers.sql, mbslave/sql/DropTriggers.sql, mbslave/sql/DropMirrorOnlyTriggers.sql
Adds AFTER UPDATE triggers on medium and release_group_primary/secondary_type; creates a_upd_medium_mirror and release_group type mirror functions; defines deferrable constraint triggers that fire only when child_order changes.
Replication triggers for new tables
mbslave/sql/CreateReplicationTriggers.sql, mbslave/sql/CreateReplicationTriggers2.sql, mbslave/sql/DropReplicationTriggers.sql, mbslave/sql/DropReplicationTriggers2.sql, mbslave/sql/CreateCustomReplicationTriggers2.sql, mbslave/sql/DropCustomReplicationTriggers2.sql
Adds reptg_medium_gid_redirect to base replication and reptg2_editor, reptg2_medium_gid_redirect, reptg2_recording_first_release_date to dbmirror2 system; creates sanitize_dbmirror2_editor_data BEFORE INSERT trigger for editor rows.
Editor data sanitization for privacy
mbslave/sql/CreateFunctions.sql, mbslave/sql/CreateViews.sql, mbslave/sql/DropFunctions.sql, mbslave/sql/DropViews.sql
Adds sanitize_editor and sanitize_editor_strict SQL functions that rebuild editor rows with sanitized/null fields and MD5-hashed name; creates sanitize_dbmirror2_editor_data trigger that rewrites pending_data JSON via JSON-to-record conversion and suppresses duplicate updates; creates editor_sanitized view.
Series relationship and constraint support
mbslave/sql/CreateConstraints.sql, mbslave/sql/CreateViews.sql
Extends series_type constraint to allow 'series' as entity_type; creates series_series view joining l_series_series, series, link, and link_attribute_text_value; adds link type, attribute type, and orderable link type configuration for "part of" series relationships.
Release label catalog number constraints
mbslave/sql/CreateConstraints.sql, mbslave/sql/CreateTables.sql, mbslave/sql/updates/20260212-mbs-14252.sql, mbslave/sql/updates/20260422-mbs-6551-all.sql, mbslave/sql/updates/20260422-mbs-6551-master_and_standalone.sql
Adds CHECK constraint for non-empty catalog_number; adds UNIQUE NULLS NOT DISTINCT constraint on (release, label, catalog_number) marked DEFERRABLE INITIALLY DEFERRED; removes source column from isrc and iswc tables.
Schema 30 and 31 comprehensive migrations
mbslave/sql/updates/schema-change/30.all.sql, mbslave/sql/updates/schema-change/31.all.sql, intermediate migration scripts (20240221 through 20260507)
Comprehensive schema consolidation scripts that orchestrate table partitioning, function/trigger creation, view updates, constraint additions, and gradual migration from interim migration scripts into unified versioned release scripts.
Cover art and event art index view updates
mbslave/sql/caa/CreateViews.sql, mbslave/sql/eaa/CreateViews.sql
Removes mime_type != 'application/pdf' filter from is_front determination in index_listing views; allows PDF cover art to be considered as front art candidate.
Drop/teardown SQL scripts
mbslave/sql/Drop*.sql files
Updates all drop scripts to properly clean up new triggers, functions, tables, views, indexes, and constraints introduced by this migration.

🎯 4 (Complex) | ⏱️ ~60 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
mbslave/sql/CreateMirrorOnlyFunctions.sql (1)

12-223: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin search_path for mirror trigger functions

mbslave/sql/CreateMirrorOnlyFunctions.sql defines all a_*_mirror() trigger functions with $$ LANGUAGE 'plpgsql'; and no SET search_path clause, yet the function bodies use unqualified names (e.g. INSERT INTO artist_release_pending_update ...). During replication apply, mbslave_sync_main uses connect_db(config) without set_search_path=True, and the repo has no ALTER ROLE/ALTER DATABASE ... SET search_path to provide a safety net—so trigger execution depends on the session search_path.

Restore per-function SET search_path = musicbrainz, public (or fully qualify the referenced objects) for each mirror function in this file.

Suggested fix pattern
-$$ LANGUAGE 'plpgsql';
+$$ LANGUAGE 'plpgsql' SET search_path = musicbrainz, public;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/CreateMirrorOnlyFunctions.sql` around lines 12 - 223, All
a_*_mirror trigger functions use unqualified table names and rely on session
search_path; update each CREATE OR REPLACE FUNCTION (e.g. a_upd_release_mirror,
a_del_release_mirror, a_ins_release_event_mirror, a_upd_release_event_mirror,
a_del_release_event_mirror, a_ins_release_group_mirror,
a_upd_release_group_mirror, a_del_release_group_mirror,
a_upd_release_group_meta_mirror, a_ins_release_group_secondary_type_join_mirror,
a_del_release_group_secondary_type_join_mirror, a_ins_release_label_mirror,
a_upd_release_label_mirror, a_del_release_label_mirror, a_ins_track_mirror,
a_upd_track_mirror, a_del_track_mirror) to explicitly pin the search_path by
adding "SET search_path = musicbrainz, public" to the function definition (using
the CREATE FUNCTION ... LANGUAGE plpgsql SET search_path = ... form) or
alternatively fully-qualify all referenced objects; apply the same change to
every function in this file so trigger execution no longer depends on the
session search_path.
mbslave/sql/CreateFunctions.sql (1)

1955-1962: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Build the secondary-type arrays from the same ordering.

secondary_types is ordered by st.secondary_type, but secondary_type_child_orders is ordered by rgst.child_order. As soon as those two orders differ, the nth child order no longer belongs to the nth secondary type, so artist_release_group materializes mismatched pairs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/CreateFunctions.sql` around lines 1955 - 1962, The two parallel
arrays are built with different ORDER BY clauses causing mismatched pairs in
artist_release_group; change the array_agg for rgst.child_order to use the same
ORDER BY as the secondary types (i.e., ORDER BY st.secondary_type) so that
array_agg(DISTINCT rgst.child_order ORDER BY st.secondary_type) FILTER
(...)::SMALLINT[] produces child orders aligned with array_agg(DISTINCT
st.secondary_type ORDER BY st.secondary_type) FILTER (...)::SMALLINT[]; update
the array_agg call that references rgst.child_order to use ORDER BY
st.secondary_type (keeping the same DISTINCT and FILTER clauses) so both arrays
are constructed from the same ordering.
mbslave/sql/CreateTables.sql (1)

948-953: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

recording_first_release_date can’t be both replicated and mirror-maintained.

This table is now tagged -- replicate, but the changed function set still deletes/inserts it from mirror-triggered paths (set_recordings_first_release_dates() and a_upd_medium_mirror()). That puts mirrors back in the business of rewriting a replicated table, which breaks the contract those functions rely on.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/CreateTables.sql` around lines 948 - 953, The table
recording_first_release_date is marked "-- replicate" but the mirror-triggered
functions set_recordings_first_release_dates() and a_upd_medium_mirror() still
delete/insert rows, which conflicts with replication vs mirror-maintained
ownership; either remove the "-- replicate" tag or stop the mirror functions
from modifying this table. Fix by choosing one ownership: if it should be
replicated, remove any delete/insert/maintenance logic for
recording_first_release_date from set_recordings_first_release_dates() and
a_upd_medium_mirror(); if it should be mirror-maintained, remove the "--
replicate" tag from the CREATE TABLE. Update only the referenced table
declaration or the two functions (set_recordings_first_release_dates,
a_upd_medium_mirror) so they no longer both claim responsibility.
♻️ Duplicate comments (1)
mbslave/sql/updates/20260507-search-756-standalone_only.sql (1)

31-35: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Same type mismatch as in 31.master_only.sql.

Cursor parameter oooseqid INTEGER should be BIGINT to match the BIGSERIAL seqid column. See earlier comment on 31.master_only.sql.

Also applies to: 72-78

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/updates/20260507-search-756-standalone_only.sql` around lines 31
- 35, The cursor parameter declaration oooseqid currently typed as INTEGER must
be changed to BIGINT to match the BIGSERIAL seqid column; update the declaration
of oooseqid in this procedure/DO block (and the equivalent declarations in the
other occurrences around the 72–78 range) so the parameter type is BIGINT
wherever oooseqid is used, ensuring any related comparisons/casts use the BIGINT
type as well.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@mbslave/sql/CreateAllReplicationTriggers2.sql`:
- Around line 2-7: CreateAllReplicationTriggers2.sql's include list omits the
custom triggers file, so when used as the entrypoint dbmirror2's custom trigger
setup is skipped; add an include for the custom triggers (e.g., a line
referencing custom/CreateReplicationTriggers2.sql or the appropriate custom
triggers file name) into the include chain alongside the existing \ir entries
(CreateReplicationTriggers2.sql, caa/..., documentation/..., statistics/...,
eaa/..., wikidocs/...) so the custom triggers are loaded when
CreateAllReplicationTriggers2.sql is executed.

In `@mbslave/sql/DropAllReplicationTriggers2.sql`:
- Around line 2-7: DropAllReplicationTriggers2.sql currently only includes
standard trigger drop scripts and omits custom dbmirror2 triggers; update the
file to also include the custom trigger teardown by adding an include for the
custom drop script (for example add a line like “\ir
dbmirror2/DropReplicationTriggers2.sql” or the actual custom teardown script
name) alongside the existing includes so that DropAllReplicationTriggers2.sql
serves as a true global teardown that removes both standard and custom dbmirror2
triggers.

In `@mbslave/sql/updates/20241017-mbs-9253-13464.sql`:
- Around line 51-53: The bool_and expression uses magic numbers for r.status
(r.status != 1 AND r.status != 5); either replace these hardcoded IDs by
referencing the release status lookup (join to the release status enum/table) or
at minimum add an inline comment documenting the meanings (e.g., "-- status:
1=Official, 5=Withdrawn") next to the r.status checks so readers know what 1 and
5 represent; locate the expression around a_rg.artist and bool_and(r.status ...)
to apply the change.
- Around line 104-107: The INSERT into artist_release_group_pending_update
currently uses a bare INSERT INTO ... (SELECT id ...) which is fragile; change
it to specify the target column explicitly (e.g., INSERT INTO
artist_release_group_pending_update (release_group) SELECT id FROM release_group
WHERE release_group.type = OLD.id) and apply the same explicit-column change to
the other trigger function(s) that insert into
artist_release_group_pending_update so the insert remains correct if the table
schema changes.

In `@mbslave/sql/updates/20250320-mbs-13768.sql`:
- Around line 21-29: The SQL declares medium_gid_redirect with a comment
claiming new_id references medium.id but does not create a FOREIGN KEY; update
the migration so it is accurate: either add an explicit FK constraint (e.g.,
alter the CREATE TABLE for medium_gid_redirect to make new_id REFERENCES
medium(id) or add an ALTER TABLE ... ADD CONSTRAINT ... FOREIGN KEY on new_id)
if you intend referential integrity, or change the inline comment on the new_id
column (or the table comment) to state explicitly that new_id is intended to
reference medium.id but no FK is added for performance (e.g., “-- intended to
reference medium.id (no FK for performance)”) so the comment is not misleading;
locate the column new_id and the table medium_gid_redirect when applying the
change.

In `@mbslave/sql/updates/20260507-search-756-mirror_only.sql`:
- Around line 22-27: The pdcursor declaration uses INTEGER for the oooseqid
parameter but must match dbmirror2.pending_data.seqid (BIGINT); update the NO
SCROLL CURSOR declaration for pdcursor to use BIGINT instead of INTEGER and
update any invocations like pdcursor (oooseqid := oooseqid) to pass the BIGINT
value (ensure the oooseqid variable/parameter is defined/typed as BIGINT); apply
the same change to all scripts that declare pdcursor (and analogous cursors)
listed in the comment so seqid values outside INTEGER range no longer cause
"integer out of range" errors.

In `@mbslave/sql/updates/schema-change/31.master_only.sql`:
- Around line 87-96: The FOR loop uses an implicitly declared loop variable
pdrecord (FOR pdrecord IN pdcursor) and later accesses pdrecord.seqid;
explicitly declare pdrecord in the DECLARE block as a record or a specific row
type (for example "pdrecord RECORD" or "pdrecord
dbmirror2.pending_data%ROWTYPE") so its structure is clear and pdrecord.seqid is
type-checked; update the DECLARE section near the pdcursor/pdrecord usage and
ensure nextseqid and pdcursor references remain unchanged.
- Around line 25-31: The cursor pdcursor is declared with parameter oooseqid as
INTEGER but the seqid column in dbmirror2.pending_data is BIGINT (BIGSERIAL),
causing potential overflow/truncation; change the cursor parameter type from
INTEGER to BIGINT (update the pdcursor declaration referencing oooseqid) and
ensure any places that OPEN or FETCH into oooseqid and comparisons against seqid
use the BIGINT type so the types match (look for uses of oooseqid, pdcursor, and
seqid in the pending_data-related code paths).

---

Outside diff comments:
In `@mbslave/sql/CreateFunctions.sql`:
- Around line 1955-1962: The two parallel arrays are built with different ORDER
BY clauses causing mismatched pairs in artist_release_group; change the
array_agg for rgst.child_order to use the same ORDER BY as the secondary types
(i.e., ORDER BY st.secondary_type) so that array_agg(DISTINCT rgst.child_order
ORDER BY st.secondary_type) FILTER (...)::SMALLINT[] produces child orders
aligned with array_agg(DISTINCT st.secondary_type ORDER BY st.secondary_type)
FILTER (...)::SMALLINT[]; update the array_agg call that references
rgst.child_order to use ORDER BY st.secondary_type (keeping the same DISTINCT
and FILTER clauses) so both arrays are constructed from the same ordering.

In `@mbslave/sql/CreateMirrorOnlyFunctions.sql`:
- Around line 12-223: All a_*_mirror trigger functions use unqualified table
names and rely on session search_path; update each CREATE OR REPLACE FUNCTION
(e.g. a_upd_release_mirror, a_del_release_mirror, a_ins_release_event_mirror,
a_upd_release_event_mirror, a_del_release_event_mirror,
a_ins_release_group_mirror, a_upd_release_group_mirror,
a_del_release_group_mirror, a_upd_release_group_meta_mirror,
a_ins_release_group_secondary_type_join_mirror,
a_del_release_group_secondary_type_join_mirror, a_ins_release_label_mirror,
a_upd_release_label_mirror, a_del_release_label_mirror, a_ins_track_mirror,
a_upd_track_mirror, a_del_track_mirror) to explicitly pin the search_path by
adding "SET search_path = musicbrainz, public" to the function definition (using
the CREATE FUNCTION ... LANGUAGE plpgsql SET search_path = ... form) or
alternatively fully-qualify all referenced objects; apply the same change to
every function in this file so trigger execution no longer depends on the
session search_path.

In `@mbslave/sql/CreateTables.sql`:
- Around line 948-953: The table recording_first_release_date is marked "--
replicate" but the mirror-triggered functions
set_recordings_first_release_dates() and a_upd_medium_mirror() still
delete/insert rows, which conflicts with replication vs mirror-maintained
ownership; either remove the "-- replicate" tag or stop the mirror functions
from modifying this table. Fix by choosing one ownership: if it should be
replicated, remove any delete/insert/maintenance logic for
recording_first_release_date from set_recordings_first_release_dates() and
a_upd_medium_mirror(); if it should be mirror-maintained, remove the "--
replicate" tag from the CREATE TABLE. Update only the referenced table
declaration or the two functions (set_recordings_first_release_dates,
a_upd_medium_mirror) so they no longer both claim responsibility.

---

Duplicate comments:
In `@mbslave/sql/updates/20260507-search-756-standalone_only.sql`:
- Around line 31-35: The cursor parameter declaration oooseqid currently typed
as INTEGER must be changed to BIGINT to match the BIGSERIAL seqid column; update
the declaration of oooseqid in this procedure/DO block (and the equivalent
declarations in the other occurrences around the 72–78 range) so the parameter
type is BIGINT wherever oooseqid is used, ensuring any related comparisons/casts
use the BIGINT type as well.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d720a1d2-ed00-4de0-a7c1-4343d8740f3a

📥 Commits

Reviewing files that changed from the base of the PR and between a6d1686 and f652ea4.

📒 Files selected for processing (74)
  • CHANGELOG.rst
  • README.rst
  • mbslave/__init__.py
  • mbslave/replication.py
  • mbslave/sql/CreateAllReplicationTriggers2.sql
  • mbslave/sql/CreateConstraints.sql
  • mbslave/sql/CreateCustomReplicationTriggers2.sql
  • mbslave/sql/CreateFKConstraints.sql
  • mbslave/sql/CreateFunctions.sql
  • mbslave/sql/CreateIndexes.sql
  • mbslave/sql/CreateMirrorOnlyFunctions.sql
  • mbslave/sql/CreateMirrorOnlyTriggers.sql
  • mbslave/sql/CreatePrimaryKeys.sql
  • mbslave/sql/CreateReplicationTriggers.sql
  • mbslave/sql/CreateReplicationTriggers2.sql
  • mbslave/sql/CreateTables.sql
  • mbslave/sql/CreateTriggers.sql
  • mbslave/sql/CreateViews.sql
  • mbslave/sql/DisableLastUpdatedTriggers.sql
  • mbslave/sql/DropAllReplicationTriggers2.sql
  • mbslave/sql/DropCustomReplicationTriggers2.sql
  • mbslave/sql/DropFKConstraints.sql
  • mbslave/sql/DropFunctions.sql
  • mbslave/sql/DropIndexes.sql
  • mbslave/sql/DropMirrorOnlyTriggers.sql
  • mbslave/sql/DropPrimaryKeys.sql
  • mbslave/sql/DropReplicationTriggers.sql
  • mbslave/sql/DropReplicationTriggers2.sql
  • mbslave/sql/DropTables.sql
  • mbslave/sql/DropTriggers.sql
  • mbslave/sql/DropViews.sql
  • mbslave/sql/EnableLastUpdatedTriggers.sql
  • mbslave/sql/InsertTestData.sql
  • mbslave/sql/TruncateTables.sql
  • mbslave/sql/caa/CreateMQTriggers.sql
  • mbslave/sql/caa/CreateViews.sql
  • mbslave/sql/caa/DropMQTriggers.sql
  • mbslave/sql/dbmirror2/README
  • mbslave/sql/dbmirror2/RefreshColumnInfo.sql
  • mbslave/sql/dbmirror2/ReplicationSetup.sql
  • mbslave/sql/dbmirror2/dbmirror2.sql
  • mbslave/sql/eaa/CreateMQTriggers.sql
  • mbslave/sql/eaa/CreateViews.sql
  • mbslave/sql/eaa/DropMQTriggers.sql
  • mbslave/sql/updates/20240221-mbs-13492.sql
  • mbslave/sql/updates/20240726-mbs-9373.sql
  • mbslave/sql/updates/20241017-mbs-9253-13464.sql
  • mbslave/sql/updates/20241017-mbs-9253-master_and_standalone.sql
  • mbslave/sql/updates/20241017-mbs-9253-mirror_only.sql
  • mbslave/sql/updates/20241125-mbs-13832.sql
  • mbslave/sql/updates/20250320-mbs-13768-fks.sql
  • mbslave/sql/updates/20250320-mbs-13768.sql
  • mbslave/sql/updates/20250408-mbs-13322.sql
  • mbslave/sql/updates/20250408-mbs-13964-all.sql
  • mbslave/sql/updates/20250425-mbs-13464-master_and_standalone.sql
  • mbslave/sql/updates/20250425-mbs-13464.sql
  • mbslave/sql/updates/20250425-mbs-13966.sql
  • mbslave/sql/updates/20260121-mbs-14092-fks.sql
  • mbslave/sql/updates/20260121-mbs-14092.sql
  • mbslave/sql/updates/20260212-mbs-14252.sql
  • mbslave/sql/updates/20260422-mbs-6551-all.sql
  • mbslave/sql/updates/20260422-mbs-6551-master_and_standalone.sql
  • mbslave/sql/updates/20260502-search-756-all.sql
  • mbslave/sql/updates/20260502-search-756-master_only.sql
  • mbslave/sql/updates/20260507-search-756-mirror_only.sql
  • mbslave/sql/updates/20260507-search-756-standalone_only.sql
  • mbslave/sql/updates/schema-change/30.all.sql
  • mbslave/sql/updates/schema-change/30.master_and_standalone.sql
  • mbslave/sql/updates/schema-change/30.mirror_only.sql
  • mbslave/sql/updates/schema-change/31.all.sql
  • mbslave/sql/updates/schema-change/31.master_and_standalone.sql
  • mbslave/sql/updates/schema-change/31.master_only.sql
  • mbslave/sql/updates/schema-change/31.mirror_only.sql
  • mbslave/sql/updates/schema-change/31.standalone_only.sql
💤 Files with no reviewable changes (8)
  • mbslave/sql/caa/DropMQTriggers.sql
  • mbslave/sql/eaa/DropMQTriggers.sql
  • mbslave/sql/dbmirror2/ReplicationSetup.sql
  • mbslave/sql/dbmirror2/RefreshColumnInfo.sql
  • mbslave/sql/caa/CreateViews.sql
  • mbslave/sql/caa/CreateMQTriggers.sql
  • mbslave/sql/eaa/CreateMQTriggers.sql
  • mbslave/sql/eaa/CreateViews.sql

Comment on lines +2 to +7
\ir CreateReplicationTriggers2.sql
\ir caa/CreateReplicationTriggers2.sql
\ir documentation/CreateReplicationTriggers2.sql
\ir statistics/CreateReplicationTriggers2.sql
\ir eaa/CreateReplicationTriggers2.sql
\ir wikidocs/CreateReplicationTriggers2.sql
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

CreateAllReplicationTriggers2.sql doesn’t include custom trigger creation.

Line 2 starts the “all triggers” include chain, but custom triggers are not included. That can leave dbmirror2 trigger setup incomplete when this helper is used as the entrypoint.

Suggested patch
 -- Helper script to create all dbmirror2 replication triggers.
 \ir CreateReplicationTriggers2.sql
+\ir CreateCustomReplicationTriggers2.sql
 \ir caa/CreateReplicationTriggers2.sql
 \ir documentation/CreateReplicationTriggers2.sql
 \ir statistics/CreateReplicationTriggers2.sql
 \ir eaa/CreateReplicationTriggers2.sql
 \ir wikidocs/CreateReplicationTriggers2.sql
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
\ir CreateReplicationTriggers2.sql
\ir caa/CreateReplicationTriggers2.sql
\ir documentation/CreateReplicationTriggers2.sql
\ir statistics/CreateReplicationTriggers2.sql
\ir eaa/CreateReplicationTriggers2.sql
\ir wikidocs/CreateReplicationTriggers2.sql
\ir CreateReplicationTriggers2.sql
\ir CreateCustomReplicationTriggers2.sql
\ir caa/CreateReplicationTriggers2.sql
\ir documentation/CreateReplicationTriggers2.sql
\ir statistics/CreateReplicationTriggers2.sql
\ir eaa/CreateReplicationTriggers2.sql
\ir wikidocs/CreateReplicationTriggers2.sql
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/CreateAllReplicationTriggers2.sql` around lines 2 - 7,
CreateAllReplicationTriggers2.sql's include list omits the custom triggers file,
so when used as the entrypoint dbmirror2's custom trigger setup is skipped; add
an include for the custom triggers (e.g., a line referencing
custom/CreateReplicationTriggers2.sql or the appropriate custom triggers file
name) into the include chain alongside the existing \ir entries
(CreateReplicationTriggers2.sql, caa/..., documentation/..., statistics/...,
eaa/..., wikidocs/...) so the custom triggers are loaded when
CreateAllReplicationTriggers2.sql is executed.

Comment on lines +2 to +7
\ir DropReplicationTriggers2.sql
\ir caa/DropReplicationTriggers2.sql
\ir documentation/DropReplicationTriggers2.sql
\ir statistics/DropReplicationTriggers2.sql
\ir eaa/DropReplicationTriggers2.sql
\ir wikidocs/DropReplicationTriggers2.sql
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

DropAllReplicationTriggers2.sql omits custom trigger teardown.

Lines 2-7 drop standard trigger sets only. If this helper is used as the global teardown entrypoint, custom dbmirror2 triggers remain installed.

Suggested patch
 -- Helper script to drop all dbmirror2 replication triggers.
 \ir DropReplicationTriggers2.sql
+\ir DropCustomReplicationTriggers2.sql
 \ir caa/DropReplicationTriggers2.sql
 \ir documentation/DropReplicationTriggers2.sql
 \ir statistics/DropReplicationTriggers2.sql
 \ir eaa/DropReplicationTriggers2.sql
 \ir wikidocs/DropReplicationTriggers2.sql
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
\ir DropReplicationTriggers2.sql
\ir caa/DropReplicationTriggers2.sql
\ir documentation/DropReplicationTriggers2.sql
\ir statistics/DropReplicationTriggers2.sql
\ir eaa/DropReplicationTriggers2.sql
\ir wikidocs/DropReplicationTriggers2.sql
-- Helper script to drop all dbmirror2 replication triggers.
\ir DropReplicationTriggers2.sql
\ir DropCustomReplicationTriggers2.sql
\ir caa/DropReplicationTriggers2.sql
\ir documentation/DropReplicationTriggers2.sql
\ir statistics/DropReplicationTriggers2.sql
\ir eaa/DropReplicationTriggers2.sql
\ir wikidocs/DropReplicationTriggers2.sql
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/DropAllReplicationTriggers2.sql` around lines 2 - 7,
DropAllReplicationTriggers2.sql currently only includes standard trigger drop
scripts and omits custom dbmirror2 triggers; update the file to also include the
custom trigger teardown by adding an include for the custom drop script (for
example add a line like “\ir dbmirror2/DropReplicationTriggers2.sql” or the
actual custom teardown script name) alongside the existing includes so that
DropAllReplicationTriggers2.sql serves as a true global teardown that removes
both standard and custom dbmirror2 triggers.

Comment on lines +51 to +53
a_rg.artist,
-- Withdrawn releases were once official by definition
bool_and(r.status IS NOT NULL AND r.status != 1 AND r.status != 5),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Magic numbers for release status.

r.status != 1 AND r.status != 5 uses hardcoded status IDs. A comment explaining what 1 and 5 mean would help:

-- status: 1=Official, 5=Withdrawn

The existing comment on line 52 hints at it ("Withdrawn releases were once official") but doesn't explain the numbers.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/updates/20241017-mbs-9253-13464.sql` around lines 51 - 53, The
bool_and expression uses magic numbers for r.status (r.status != 1 AND r.status
!= 5); either replace these hardcoded IDs by referencing the release status
lookup (join to the release status enum/table) or at minimum add an inline
comment documenting the meanings (e.g., "-- status: 1=Official, 5=Withdrawn")
next to the r.status checks so readers know what 1 and 5 represent; locate the
expression around a_rg.artist and bool_and(r.status ...) to apply the change.

Comment on lines +104 to +107
INSERT INTO artist_release_group_pending_update (
SELECT id FROM release_group
WHERE release_group.type = OLD.id
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

INSERT without explicit column list is fragile.

INSERT INTO artist_release_group_pending_update (
    SELECT id FROM release_group

If artist_release_group_pending_update table structure changes, this breaks silently or noisily depending on the change. Best practice is explicit columns:

INSERT INTO artist_release_group_pending_update (release_group)
    SELECT id FROM release_group
Proposed fix for both trigger functions
-        INSERT INTO artist_release_group_pending_update (
-            SELECT id FROM release_group
+        INSERT INTO artist_release_group_pending_update (release_group)
+            SELECT id FROM release_group
             WHERE release_group.type = OLD.id
         );
-        INSERT INTO artist_release_group_pending_update (
-            SELECT release_group
+        INSERT INTO artist_release_group_pending_update (release_group)
+            SELECT release_group
             FROM release_group_secondary_type_join
             WHERE secondary_type = OLD.id
         );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/updates/20241017-mbs-9253-13464.sql` around lines 104 - 107, The
INSERT into artist_release_group_pending_update currently uses a bare INSERT
INTO ... (SELECT id ...) which is fragile; change it to specify the target
column explicitly (e.g., INSERT INTO artist_release_group_pending_update
(release_group) SELECT id FROM release_group WHERE release_group.type = OLD.id)
and apply the same explicit-column change to the other trigger function(s) that
insert into artist_release_group_pending_update so the insert remains correct if
the table schema changes.

Comment on lines +21 to +29
CREATE TABLE medium_gid_redirect ( -- replicate (verbose)
gid UUID NOT NULL, -- PK
new_id INTEGER NOT NULL, -- references medium.id
created TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

ALTER TABLE medium_gid_redirect ADD CONSTRAINT medium_gid_redirect_pkey PRIMARY KEY (gid);

CREATE INDEX medium_gid_redirect_idx_new_id ON medium_gid_redirect (new_id);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Comment claims FK reference but no constraint is created.

Line 23 says -- references medium.id but there's no actual FOREIGN KEY constraint. The comment is misleading. Either:

  1. Add the FK: REFERENCES medium(id)
  2. Remove the misleading comment

Looking at other *_gid_redirect tables in MusicBrainz schema, they typically don't have FKs for performance reasons during merges. If that's intentional here, the comment should say -- intended to reference medium.id (no FK for performance) or just drop it.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/updates/20250320-mbs-13768.sql` around lines 21 - 29, The SQL
declares medium_gid_redirect with a comment claiming new_id references medium.id
but does not create a FOREIGN KEY; update the migration so it is accurate:
either add an explicit FK constraint (e.g., alter the CREATE TABLE for
medium_gid_redirect to make new_id REFERENCES medium(id) or add an ALTER TABLE
... ADD CONSTRAINT ... FOREIGN KEY on new_id) if you intend referential
integrity, or change the inline comment on the new_id column (or the table
comment) to state explicitly that new_id is intended to reference medium.id but
no FK is added for performance (e.g., “-- intended to reference medium.id (no FK
for performance)”) so the comment is not misleading; locate the column new_id
and the table medium_gid_redirect when applying the change.

Comment on lines +22 to +27
pdcursor NO SCROLL CURSOR (oooseqid INTEGER) FOR
SELECT seqid
FROM dbmirror2.pending_data
WHERE xid = txid_current()
AND seqid >= oooseqid
ORDER BY seqid DESC
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify type mismatch locations in changed SQL files
rg -n -C2 'pdcursor\s+NO SCROLL CURSOR\s*\(oooseqid\s+INTEGER\)|oooseqid\s+BIGINT|seqid\s+BIGINT' \
  mbslave/sql/updates/20260507-search-756-mirror_only.sql \
  mbslave/sql/updates/schema-change/31.mirror_only.sql

Repository: acoustid/mbslave

Length of output: 1754


🏁 Script executed:

#!/bin/bash
set -euo pipefail

files=(
  "mbslave/sql/updates/20260507-search-756-mirror_only.sql"
  "mbslave/sql/updates/schema-change/31.mirror_only.sql"
)

for f in "${files[@]}"; do
  echo "===== $f (cursor declare/open context) ====="
  # Show around cursor declaration and around OPEN statements, if any
  rg -n "pdcursor" "$f" -n || true
  rg -n "OPEN\s+pdcursor" "$f" -n || true

  # Print a small window around cursor declaration line (the file is small enough around there)
  start=1
  # Use line numbers from previous grep: cursor declare at line ~22/28; dump a bit more context
  # We'll just dump lines 1-120 (these files look small; adjust if needed)
  sed -n '1,140p' "$f" | cat -n
done

echo "===== pending_data seqid type (search) ====="
# Find where pending_data is created/defined; best-effort search for "pending_data" and "seqid" in repo
rg -n "dbmirror2\.pending_data|pending_data" mbslave/sql -S || true
rg -n "CREATE\s+(TABLE|VIEW).*pending_data|CREATE\s+OR\s+REPLACE\s+(VIEW|MATERIALIZED\s+VIEW).*pending_data" mbslave/sql -S || true
rg -n "pending_data.*seqid|seqid.*pending_data" mbslave/sql -S || true

Repository: acoustid/mbslave

Length of output: 27937


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show pending_data.seqid type from canonical schema locations
candidates=(
  "mbslave/sql/dbmirror2/dbmirror2.sql"
  "mbslave/sql/updates/20220328-mbs-12250-mirror_only.sql"
)

for f in "${candidates[@]}"; do
  echo "===== $f: pending_data table definition around CREATE TABLE ====="
  # Print around the CREATE TABLE block where pending_data is defined
  # (limit output)
  rg -n "CREATE TABLE dbmirror2\.pending_data" "$f" -n
  # dump a window after that line
  line=$(rg -n "CREATE TABLE dbmirror2\.pending_data" "$f" | head -n1 | cut -d: -f1)
  start=$((line-5))
  end=$((line+80))
  sed -n "${start},${end}p" "$f" | cat -n
done

# Also check whether seqid is redefined later in those files
for f in "${candidates[@]}"; do
  echo "===== $f: any later ALTER TABLE on pending_data.seqid ====="
  rg -n "ALTER TABLE dbmirror2\.pending_data.*seqid|pending_data.*seqid" "$f" -n | head -n 50 || true
done

Repository: acoustid/mbslave

Length of output: 3312


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "pdcursor\s+NO SCROLL CURSOR\s*\(oooseqid\s+INTEGER\)" mbslave/sql/updates -S || true

Repository: acoustid/mbslave

Length of output: 1201


Use BIGINT for pdcursor parameter (oooseqid) to match dbmirror2.pending_data.seqid

oooseqid is BIGINT, but pdcursor is declared as NO SCROLL CURSOR (oooseqid INTEGER) and is invoked via pdcursor (oooseqid := oooseqid). Since dbmirror2.pending_data.seqid is BIGSERIAL/BIGINT, passing values beyond the INTEGER range can throw (e.g., “integer out of range”) and break the out-of-order repair path. This exact cursor declaration mismatch exists across multiple scripts, including:

  • mbslave/sql/updates/20260507-search-756-mirror_only.sql
  • mbslave/sql/updates/schema-change/31.mirror_only.sql
  • mbslave/sql/updates/20260507-search-756-standalone_only.sql
  • mbslave/sql/updates/20260502-search-756-master_only.sql
  • mbslave/sql/updates/20220328-mbs-12250-master.sql
  • mbslave/sql/updates/20230423-nix-pending-keys-insertion.sql
  • mbslave/sql/updates/schema-change/31.{master_only,standalone_only}.sql
  • mbslave/sql/updates/schema-change/27.master_only.sql
  • mbslave/sql/updates/schema-change/28.master_only.sql
Proposed fix
-    pdcursor    NO SCROLL CURSOR (oooseqid INTEGER) FOR
+    pdcursor    NO SCROLL CURSOR (oooseqid BIGINT) FOR
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pdcursor NO SCROLL CURSOR (oooseqid INTEGER) FOR
SELECT seqid
FROM dbmirror2.pending_data
WHERE xid = txid_current()
AND seqid >= oooseqid
ORDER BY seqid DESC
pdcursor NO SCROLL CURSOR (oooseqid BIGINT) FOR
SELECT seqid
FROM dbmirror2.pending_data
WHERE xid = txid_current()
AND seqid >= oooseqid
ORDER BY seqid DESC
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/updates/20260507-search-756-mirror_only.sql` around lines 22 -
27, The pdcursor declaration uses INTEGER for the oooseqid parameter but must
match dbmirror2.pending_data.seqid (BIGINT); update the NO SCROLL CURSOR
declaration for pdcursor to use BIGINT instead of INTEGER and update any
invocations like pdcursor (oooseqid := oooseqid) to pass the BIGINT value
(ensure the oooseqid variable/parameter is defined/typed as BIGINT); apply the
same change to all scripts that declare pdcursor (and analogous cursors) listed
in the comment so seqid values outside INTEGER range no longer cause "integer
out of range" errors.

Comment on lines +25 to +31
-- out-of-order seqid
oooseqid BIGINT;
oootrgdepth INTEGER;
pdcursor NO SCROLL CURSOR (oooseqid INTEGER) FOR
SELECT seqid
FROM dbmirror2.pending_data
WHERE xid = txid_current()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Type mismatch between cursor parameter and assigned value.

Cursor declares oooseqid INTEGER (line 28), but line 75 assigns from seqid which comes from a BIGSERIAL column (so it's BIGINT). This works because of implicit casting, but if seqid values exceed INTEGER range (~2.1B), you'll get overflow or truncation in the cursor filter.

Given the seqid column is BIGSERIAL, the cursor parameter should be BIGINT to match.

Proposed fix
     pdcursor    NO SCROLL CURSOR (oooseqid INTEGER) FOR
+    pdcursor    NO SCROLL CURSOR (oooseqid BIGINT) FOR

Also applies to: 72-79

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/updates/schema-change/31.master_only.sql` around lines 25 - 31,
The cursor pdcursor is declared with parameter oooseqid as INTEGER but the seqid
column in dbmirror2.pending_data is BIGINT (BIGSERIAL), causing potential
overflow/truncation; change the cursor parameter type from INTEGER to BIGINT
(update the pdcursor declaration referencing oooseqid) and ensure any places
that OPEN or FETCH into oooseqid and comparisons against seqid use the BIGINT
type so the types match (look for uses of oooseqid, pdcursor, and seqid in the
pending_data-related code paths).

Comment on lines +87 to +96
oootrgdepth, pg_trigger_depth(), TG_OP, _tablename, OLD, NEW;
END IF;

FOR pdrecord IN pdcursor (oooseqid := oooseqid) LOOP
UPDATE dbmirror2.pending_data
SET seqid = nextseqid
WHERE CURRENT OF pdcursor;

nextseqid := pdrecord.seqid;
END LOOP;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Undeclared loop variable pdrecord.

The FOR pdrecord IN pdcursor loop references pdrecord which isn't declared in the DECLARE block. PL/pgSQL does implicitly declare FOR loop variables, but accessing pdrecord.seqid on line 95 assumes a record type. This should work, but it's a bit sloppy - explicit declaration would make the code self-documenting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mbslave/sql/updates/schema-change/31.master_only.sql` around lines 87 - 96,
The FOR loop uses an implicitly declared loop variable pdrecord (FOR pdrecord IN
pdcursor) and later accesses pdrecord.seqid; explicitly declare pdrecord in the
DECLARE block as a record or a specific row type (for example "pdrecord RECORD"
or "pdrecord dbmirror2.pending_data%ROWTYPE") so its structure is clear and
pdrecord.seqid is type-checked; update the DECLARE section near the
pdcursor/pdrecord usage and ensure nextseqid and pdcursor references remain
unchanged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant