New DBLP invitation — migrate authors to author{} object schema#2846

Open
carlosmondra wants to merge 26 commits into master from feature/new-dblp-invitation

Conversation

@carlosmondra
Member

@carlosmondra carlosmondra commented Feb 9, 2026

Summary

Migrates the DBLP / arXiv / ORCID public-article import flow from the legacy parallel content.authors + content.authorids string arrays to the new author{} object schema, where content.authors.value is a single list of { fullname, username } objects. Updates every Public_Article invitation, pre-process, and post-process that touches the authors field, and extends the profile name-removal flow so names linked to new-schema publications are rewritten (or refused for auto-accept) correctly.
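For reviewers, a minimal sketch of the shape change described above. The helper name `to_author_objects` and the sample values are hypothetical and only illustrate how two parallel arrays map onto one list of `{ fullname, username }` objects; this is not the migration code in this PR.

```python
from typing import Dict, List


def to_author_objects(authors: List[str], authorids: List[str]) -> List[Dict[str, str]]:
    """Zip legacy parallel arrays into a list of {fullname, username} objects."""
    if len(authors) != len(authorids):
        # Mismatched lengths are the kind of drift the legacy schema allowed;
        # this sketch refuses to guess how the entries line up.
        raise ValueError("authors and authorids must have the same length")
    return [
        {"fullname": fullname, "username": username}
        for fullname, username in zip(authors, authorids)
    ]


# Legacy content: two arrays that callers had to keep in sync.
legacy_content = {
    "authors": {"value": ["Edith Last", "Josiah Couch"]},
    "authorids": {"value": ["~Edith_Last1", "~Josiah_Couch1"]},
}

# New content: a single list of typed objects under content.authors.value.
new_content = {
    "authors": {
        "value": to_author_objects(
            legacy_content["authors"]["value"],
            legacy_content["authorids"]["value"],
        )
    }
}
```

Raising on a length mismatch is a design choice of this sketch: silently truncating with `zip` would hide exactly the inconsistency the new schema is meant to rule out.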

Motivation

The API's new author{} invitation type stores authorship as a single list of typed objects instead of two parallel arrays that callers had to zip and keep in sync. It fixes a class of bugs where authors[i] and authorids[i] drifted out of order, and it cleanly supports partial authorship info (e.g. a DBLP author with no OpenReview profile yet). All existing Authorship_Claim / Author_Removal flows had to be rebuilt against the new shape, and the name-removal support flow had to learn the new schema so it can still rewrite usernames on publications that contain a removed alternate name.

Changes

Record ingestion: DBLP / arXiv / ORCID

Public_Article invitations

  • openreview/profile/management.py:
    • Authorship_Claim and Author_Removal now require a new author_name content field alongside author_index / author_id.
    • Both invitations replace the target author with a { fullname, username } object at the given index (hidden const.replace), instead of writing the username into authorids[index].
    • Author_Removal now binds to author_removal_pre_process.js (previously it reused the coreference pre-process).
    • The article body schema replaces authors: string[] + authorids: string[] with authors: author{} (typed { fullname, username }).
  • openreview.net/Public_Article/-/Edit gains openreview.net/Support as an invitee/reader so the name-removal decision process can sign corrective edits.
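The effect of the index-based replacement can be sketched in plain Python. This only illustrates the outcome on `content.authors.value`; the real replacement is done by the invitation's hidden const.replace, and the function name `claim_author` plus the empty-string username for an unclaimed author are assumptions made for this example.

```python
from typing import Dict, List

Author = Dict[str, str]


def claim_author(
    authors: List[Author], author_index: int, author_name: str, author_id: str
) -> List[Author]:
    """Return a copy of authors with the entry at author_index replaced."""
    if not 0 <= author_index < len(authors):
        raise IndexError(f"author_index {author_index} is out of range")
    updated = [dict(author) for author in authors]  # leave the input untouched
    # The whole object is replaced, so fullname and username cannot drift apart.
    updated[author_index] = {"fullname": author_name, "username": author_id}
    return updated


before = [
    {"fullname": "Edith Alternate Last", "username": ""},  # hypothetical unclaimed entry
    {"fullname": "Josiah Couch", "username": "~Josiah_Couch1"},
]
after = claim_author(before, 0, "Edith Alternate Last", "~Edith_Alternate_Last1")
```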

Pre-process validation

Profile name removal

  • openreview/profile/process/request_remove_name_decision_process.py — the per-publication loop detects whether authors.value is a list of dicts (new schema) or a list of strings (legacy), and routes writes through the correct shape. For new-schema notes it rewrites { fullname, username } in place to the profile's preferred name / id; for legacy notes the existing dual-array update is unchanged.
  • openreview/profile/process/request_remove_name_process.py — the auto-accept heuristic's "does the user have publications?" check now uses the new Note.authors property, so a request whose username is only linked via new-schema DBLP records is not auto-accepted.
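A minimal sketch of the schema detection and rewrite described in this section, assuming plain dict content. The function name `rewrite_removed_name` is hypothetical, and the legacy branch (including whether the display name is also updated) is an assumption about the existing dual-array behaviour, not a copy of the process script.

```python
from typing import Any, Dict


def rewrite_removed_name(
    content: Dict[str, Any],
    removed_username: str,
    preferred_name: str,
    preferred_id: str,
) -> Dict[str, Any]:
    """Rewrite a removed username in place, in whichever schema the note uses."""
    authors = content.get("authors", {}).get("value", [])
    if authors and all(isinstance(author, dict) for author in authors):
        # New schema: one list of {fullname, username} objects.
        for author in authors:
            if author.get("username") == removed_username:
                author["fullname"] = preferred_name
                author["username"] = preferred_id
    else:
        # Legacy schema: parallel authors / authorids string arrays.
        authorids = content.get("authorids", {}).get("value", [])
        for index, authorid in enumerate(authorids):
            if authorid == removed_username:
                authorids[index] = preferred_id
                if index < len(authors):
                    authors[index] = preferred_name
    return content


new_schema = {
    "authors": {
        "value": [
            {"fullname": "Edith Alternate Last", "username": "~Edith_Alternate_Last1"},
            {"fullname": "Josiah Couch", "username": "~Josiah_Couch1"},
        ]
    }
}
rewrite_removed_name(new_schema, "~Edith_Alternate_Last1", "Edith Last", "~Edith_Last1")

legacy = {
    "authors": {"value": ["Edith Alternate Last"]},
    "authorids": {"value": ["~Edith_Alternate_Last1"]},
}
rewrite_removed_name(legacy, "~Edith_Alternate_Last1", "Edith Last", "~Edith_Last1")
```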

SDK

  • openreview/api/client.py — new read-only Note.authors property. Returns a canonical [{ fullname, username }] list regardless of which schema the note stores. Serialization (to_json) is untouched, so round-trips preserve the wire format.
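One way such a read-only property could normalize both schemas is sketched below. `NoteSketch` is a stand-in class written for this example, not the SDK's `Note`; padding missing legacy ids with `None` is an assumption.

```python
from typing import Any, Dict, List, Optional


class NoteSketch:
    """Stand-in for a note; only the content dict matters here."""

    def __init__(self, content: Dict[str, Any]) -> None:
        self.content = content

    @property
    def authors(self) -> List[Dict[str, Optional[str]]]:
        """Return [{fullname, username}] for either schema without mutating content."""
        authors = self.content.get("authors", {}).get("value", [])
        if all(isinstance(author, dict) for author in authors):
            # New schema (or no authors at all): copy the two known keys.
            return [
                {"fullname": a.get("fullname"), "username": a.get("username")}
                for a in authors
            ]
        # Legacy schema: zip names with ids, padding missing ids with None.
        authorids = self.content.get("authorids", {}).get("value", [])
        padded = list(authorids) + [None] * (len(authors) - len(authorids))
        return [{"fullname": n, "username": u} for n, u in zip(authors, padded)]


legacy_note = NoteSketch(
    {"authors": {"value": ["Edith Last"]}, "authorids": {"value": ["~Edith_Last1"]}}
)
new_note = NoteSketch(
    {"authors": {"value": [{"fullname": "Edith Last", "username": "~Edith_Last1"}]}}
)
```

Because the property builds a fresh list and never writes back to `content`, the stored wire format is left as it was, which matches the round-trip claim above.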

Tests

New

  • tests/test_profile_management.py: test_remove_name_with_dblp_publication, an end-to-end test where a user imports an openreview.net/Public_Article/DBLP.org/-/Record note, claims authorship with an alternate name via Authorship_Claim, and requests removal of the alternate name; support accepts, and the DBLP record's authors[0] is rewritten from { fullname: "Edith Alternate Last", username: "~Edith_Alternate_Last1" } to { fullname: "Edith Last", username: "~Edith_Last1" }.

Updated

  • tests/test_profile_management.py — existing DBLP / arXiv / ORCID import tests updated to post authors: [{fullname, username}] objects and assert against the new shape; Authorship_Claim / Author_Removal calls pass author_name; error=True removed from the await_queue_edit calls that used to tolerate post-process failures.
  • tests/test_iclr_conference_v2.py — asserts that guest search hides authors/authorids while PC search still returns them, verifying the new indexing against the migrated schema.
  • tests/test_abstract_deadline.py — awaits the newly-added Deletion-0-1 process queue now emitted by the workflow.

Test plan

  • pytest tests/test_profile_management.py passes end-to-end.
  • pytest tests/test_profile_management.py::TestProfileManagement::test_remove_name_with_dblp_publication passes (covers the name-removal → DBLP rewrite path).
  • pytest tests/test_iclr_conference_v2.py passes (indexing assertions).
  • pytest tests/test_abstract_deadline.py passes.
  • Manually import a DBLP record, claim authorship with an alternate name, then request removal of that name — the resulting DBLP record shows the preferred name + id in authors[claimed_index].
  • Verify ORCID ingestion still produces "First Last" order for records where the upstream returns "Last, First" (covered by TODO block in orcid_record_process.js).
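For the ORCID ordering check in the last item, the expected behaviour can be sketched as follows. The actual fix lives in the JavaScript process script; this Python helper, `reorder_credit_name`, is a hypothetical illustration of the "Last, First" to "First Last" rule and assumes at most one meaningful comma.

```python
def reorder_credit_name(credit_name: str) -> str:
    """Turn 'Last, First' into 'First Last'; leave comma-free names as they are."""
    if "," not in credit_name:
        # Already in display order; only normalize whitespace.
        return " ".join(credit_name.split())
    last, first = credit_name.split(",", 1)
    # Joining then splitting collapses extra spaces and handles an empty part.
    return " ".join(f"{first} {last}".split())


reordered = reorder_credit_name("Couch, Josiah")
unchanged = reorder_credit_name("Josiah Couch")
```

Stripping the comma without swapping the parts would give "Couch Josiah", which is the behaviour the commit message attributes to the published client library.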

Comment thread openreview/profile/process/deprecated_dblp_record_process.js
carlosmondra and others added 2 commits February 11, 2026 14:43
- Update DBLP, arXiv, ORCID Record invitation schemas from separate
  authors (string[]) and authorids (string[]) to single authors (author{})
  field with {fullname, username} objects
- Add author_name field to Authorship_Claim and Author_Removal invitations
  and use object replacement in const.replace
- Update record process scripts to preserve existing usernames from author
  objects instead of authorids
- Update preprocess scripts to read fullname/username from author objects
- Add backward compatibility in deprecated DBLP process script to convert
  author{} output back to legacy string arrays
- Update tests for new format

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The published @openreview/client strips commas from ORCID credit-name
fields ("Last, First" → "Last First") instead of reordering them
("First Last"). Parse contributor names directly from the ORCID JSON
in the process script to ensure correct ordering regardless of library
version. Includes TODO markers for cleanup once the library is updated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@carlosmondra carlosmondra force-pushed the feature/new-dblp-invitation branch from 49978ce to 289389e Compare February 11, 2026 19:43
@xkopenreview
Contributor

i thought author schema of DBLP.org/-/Record will not be changed, only public_article/dblp will have object author?

So currently DBLP import uses DBLP.org/-/Record with the string author schema.
Web should change DBLP import to use public_article/dblp (object schema); papers imported by coauthors using DBLP.org/-/Record still update the authors and authorids arrays.
Will the papers imported using DBLP.org/-/Record be migrated to the author schema?
Web removes the logic related to DBLP.org/-/Record and v1 DBLP.

is the above process correct?

@melisabok
Member

> i thought author schema of DBLP.org/-/Record will not be changed, only public_article/dblp will have object author?

Yes, that should be the case.

@melisabok
Member

This PR is not modifying the DBLP.org/-/Record invitation.

Copilot AI review requested due to automatic review settings March 30, 2026 17:16
Contributor

Copilot AI left a comment

Pull request overview

Migrates Public Article record import and authorship-claim flows (DBLP/arXiv/ORCID) from legacy authors (string[]) + authorids (string[]) into a single structured authors field ({fullname, username} objects), updating processing scripts and tests accordingly.

Changes:

  • Update DBLP/arXiv/ORCID record processing to produce and maintain content.authors.value as {fullname, username} objects.
  • Update Authorship_Claim / Author_Removal invitation schemas to replace an author entry with a {fullname, username} object (introducing author_name).
  • Update tests to assert the new authors object format and the updated claim behavior.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| tests/test_profile_management.py | Updates expected imported note content and authorship-claim payloads/assertions for the new authors object format. |
| openreview/profile/process/orcid_record_process.js | Builds authors objects directly from ORCID JSON and preserves existing usernames. |
| openreview/profile/process/deprecated_dblp_record_process.js | Adds backward-compat conversion from authors objects back to legacy authors/authorids arrays. |
| openreview/profile/process/dblp_record_process.js | Converts legacy converter output to authors objects and attempts to preserve existing usernames. |
| openreview/profile/process/author_removal_pre_process.js | Switches validation from authorids to authors[].username for removal checks. |
| openreview/profile/process/author_coreference_pre_process.js | Switches validation from authorids/string authors to authors[] objects (username/fullname). |
| openreview/profile/process/arxiv_record_process.js | Converts legacy converter output to authors objects and attempts to preserve existing usernames. |
| openreview/profile/management.py | Updates invitation schemas for new authors object format and adds author_name to claim/removal inputs. |

Comment thread openreview/profile/management.py
if (p['contributor-orcid']?.uri) {
username = p['contributor-orcid'].uri;
} else if (fullname) {
username = `https://orcid.org/orcid-search/search?searchQuery=${fullname}`;

Copilot AI Mar 30, 2026

The fallback ORCID search URL is built with the raw fullname string. Because names can contain spaces and punctuation, this can generate invalid URLs (or unexpected query parsing). URL-encode the query value (e.g., via encodeURIComponent) before interpolating it into searchQuery=.

Suggested change
- username = `https://orcid.org/orcid-search/search?searchQuery=${fullname}`;
+ username = `https://orcid.org/orcid-search/search?searchQuery=${encodeURIComponent(fullname)}`;

Comment thread openreview/profile/process/orcid_record_process.js
Comment thread openreview/profile/management.py
Comment thread openreview/profile/process/dblp_record_process.js
Comment thread openreview/profile/process/arxiv_record_process.js
return Promise.reject(new OpenReviewError({ name: 'Error', message: `The author name ${authorName} from index ${authorIndex} doesn't match with the names listed in your profile` }));
const authorUsername = publication.content.authors.value[authorIndex]?.username;
if (!usernames.some(username => username === authorUsername)) {
return Promise.reject(new OpenReviewError({ name: 'Error', message: `The author name ${authorUsername} from index ${authorIndex} doesn't match with the names listed in your profile` }));

Copilot AI Mar 30, 2026

This error message says "author name" but the value being checked and interpolated here is the author's username (authorUsername). Updating the message to refer to "author username" (or using the fullname instead) will make failures easier to understand.

Suggested change
- return Promise.reject(new OpenReviewError({ name: 'Error', message: `The author name ${authorUsername} from index ${authorIndex} doesn't match with the names listed in your profile` }));
+ return Promise.reject(new OpenReviewError({ name: 'Error', message: `The author username ${authorUsername} from index ${authorIndex} doesn't match with the usernames listed in your profile` }));

Comment thread openreview/profile/process/author_removal_pre_process.js
const userProfile = profiles[0];

const usernames = userProfile.content.names.map(name => name.username);

Copilot AI Mar 30, 2026

This pre-process script is not referenced anywhere in the repository (no invitation points to process/author_removal_pre_process.js). If Author_Removal is intended to use it, wire it up in openreview/profile/management.py; otherwise consider removing it to avoid maintaining dead code.

Suggested change
- const usernames = userProfile.content.names.map(name => name.username);
+ const names = Array.isArray(userProfile?.content?.names) ? userProfile.content.names : [];
+ const usernames = names
+   .map(name => name && name.username)
+   .filter(username => !!username);

Comment on lines 1501 to +1503
  note = josiah_client.get_note(edit['note']['id'])
  assert note.external_ids == ['doi:10.1103/physreva.109.022426']
- assert '~Josiah_Couch1' == note.content['authorids']['value'][0]
+ assert note.content['authors']['value'][0] == {'fullname': 'Josiah Couch', 'username': '~Josiah_Couch1'}

Copilot AI Mar 30, 2026

In this test, there is an Authorship_Claim attempt shortly after this assertion that omits the newly-added author_name field. Since author_name is required by the updated invitation schema, that claim is likely to fail schema validation before the pre-process error you’re asserting is raised. Update the failing claim attempt to include author_name (or make author_name optional in the invitation if omission should be supported).

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

openreview/profile/process/deprecated_dblp_record_process.js:44

  • abstractError is assigned in the catch block but never used after removing the throw at the end of the function. This is now dead code and can be removed (and the catch can just log), or reintroduce the throw if failures should still fail the queue edit.
  const html = note.content.html?.value;
  let abstractError = false;

  try {
    if (html) {
      const { abstract, pdf, error } = await Tools.extractAbstract(html);
      console.log('abstract: ' + abstract);
      console.log('pdf: ' + pdf);
      console.log('error: ' + error);
      if (abstract) {
        note.content.abstract = { value: abstract };
      }
      if (pdf) {
        note.content.pdf = { value: pdf };
      }
    } else {
      console.log('html field is empty');
    }
  } catch (error) {
    console.log('error: ' + JSON.stringify(error?.toJson?.()));
    abstractError = error;
  }

Comment thread openreview/profile/management.py
Comment thread openreview/profile/process/author_coreference_pre_process.js
Comment thread openreview/profile/process/author_removal_pre_process.js
Comment thread openreview/profile/process/author_removal_pre_process.js
Comment thread openreview/profile/process/deprecated_dblp_record_process.js Outdated
@melisabok melisabok changed the title Migrate Record/Authorship_Claim invitations to author{} object format New DBLP invitation — migrate authors to author{} object schema Apr 16, 2026
4 participants