New DBLP invitation — migrate authors to author{} object schema#2846
carlosmondra wants to merge 26 commits into master
- Update DBLP, arXiv, ORCID Record invitation schemas from separate
authors (string[]) and authorids (string[]) to single authors (author{})
field with {fullname, username} objects
- Add author_name field to Authorship_Claim and Author_Removal invitations
and use object replacement in const.replace
- Update record process scripts to preserve existing usernames from author
objects instead of authorids
- Update preprocess scripts to read fullname/username from author objects
- Add backward compatibility in deprecated DBLP process script to convert
author{} output back to legacy string arrays
- Update tests for new format
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
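A minimal sketch of the conversion this commit message describes, in plain JavaScript to match the process scripts. The helper names `toAuthorObjects` and `toLegacyArrays`, the rule used to decide when an existing username is preserved (same index and same fullname), and the `example.org` id are illustrative assumptions, not code taken from this PR.

```javascript
// Hypothetical helper: zip the legacy parallel arrays into author objects.
// `existingAuthors` is the authors list already stored on the note, if any.
function toAuthorObjects(authors, authorids, existingAuthors = []) {
  return authors.map((fullname, i) => {
    const existing = existingAuthors[i];
    // Assumption: keep a previously stored username (for example a claimed
    // "~First_Last1" id) only when the entry at this index has the same fullname.
    const preserved =
      existing && existing.fullname === fullname ? existing.username : undefined;
    return { fullname, username: preserved || authorids[i] || '' };
  });
}

// Hypothetical inverse for consumers that still expect the legacy shape.
function toLegacyArrays(authorObjects) {
  return {
    authors: authorObjects.map(author => author.fullname),
    authorids: authorObjects.map(author => author.username),
  };
}

// Made-up sample data: the first author has already claimed the paper.
const converted = toAuthorObjects(
  ['Josiah Couch', 'Jane Doe'],
  ['https://example.org/authors/josiah-couch', 'https://example.org/authors/jane-doe'],
  [{ fullname: 'Josiah Couch', username: '~Josiah_Couch1' }]
);
const legacy = toLegacyArrays(converted);
console.log(converted);
console.log(legacy);
```

Keeping both directions as pure functions makes the backward-compatibility path in the deprecated script easy to test in isolation.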
The published @openreview/client strips commas from ORCID credit-name
fields ("Last, First" → "Last First") instead of reordering them
("First Last"). Parse contributor names directly from the ORCID JSON
in the process script to ensure correct ordering regardless of library
version. Includes TODO markers for cleanup once the library is updated.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
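The difference between stripping the comma and reordering the name can be sketched as follows. This is an illustration only: the body of `formatCreditName` below is a guess at the behavior described in this commit message, and `stripCommas` is a hypothetical stand-in for the comma-stripping behavior, not the library's actual code. Names with more than one comma (suffixes such as "Jr.") are not handled specially.

```javascript
// Hypothetical sketch: reorder "Last, First" into "First Last".
function formatCreditName(creditName) {
  if (typeof creditName !== 'string') return '';
  const trimmed = creditName.trim();
  const commaIndex = trimmed.indexOf(',');
  // No comma: assume the name is already in "First Last" order.
  if (commaIndex === -1) return trimmed;
  const last = trimmed.slice(0, commaIndex).trim();
  const first = trimmed.slice(commaIndex + 1).trim();
  // Reorder the two halves; drop an empty half such as in "Last,".
  return [first, last].filter(Boolean).join(' ');
}

// Stand-in for the behavior being worked around: the comma is removed,
// but the word order is left unchanged.
const stripCommas = name => name.replace(/,/g, '').replace(/\s+/g, ' ').trim();

console.log(stripCommas('Couch, Josiah'));      // "Couch Josiah"
console.log(formatCreditName('Couch, Josiah')); // "Josiah Couch"
```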
Force-pushed from 49978ce to 289389e.
I thought the author schema of DBLP.org/-/Record would not be changed, and that only public_article/dblp would get object authors? Currently the DBLP import uses DBLP.org/-/Record with the string author schema. Is the above process correct?

This PR is not modifying the DBLP.org/-/Record invitation.
Pull request overview
Migrates Public Article record import and authorship-claim flows (DBLP/arXiv/ORCID) from legacy authors (string[]) + authorids (string[]) into a single structured authors field ({fullname, username} objects), updating processing scripts and tests accordingly.
Changes:
- Update DBLP/arXiv/ORCID record processing to produce and maintain `content.authors.value` as `{fullname, username}` objects.
- Update Authorship_Claim / Author_Removal invitation schemas to replace an author entry with a `{fullname, username}` object (introducing `author_name`).
- Update tests to assert the new `authors` object format and the updated claim behavior.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| `tests/test_profile_management.py` | Updates expected imported note content and authorship-claim payloads/assertions for the new authors object format. |
| `openreview/profile/process/orcid_record_process.js` | Builds authors objects directly from ORCID JSON and preserves existing usernames. |
| `openreview/profile/process/deprecated_dblp_record_process.js` | Adds backward-compat conversion from authors objects back to legacy authors/authorids arrays. |
| `openreview/profile/process/dblp_record_process.js` | Converts legacy converter output to authors objects and attempts to preserve existing usernames. |
| `openreview/profile/process/author_removal_pre_process.js` | Switches validation from authorids to authors[].username for removal checks. |
| `openreview/profile/process/author_coreference_pre_process.js` | Switches validation from authorids/string authors to authors[] objects (username/fullname). |
| `openreview/profile/process/arxiv_record_process.js` | Converts legacy converter output to authors objects and attempts to preserve existing usernames. |
| `openreview/profile/management.py` | Updates invitation schemas for new authors object format and adds author_name to claim/removal inputs. |
if (p['contributor-orcid']?.uri) {
  username = p['contributor-orcid'].uri;
} else if (fullname) {
  username = `https://orcid.org/orcid-search/search?searchQuery=${fullname}`;
The fallback ORCID search URL is built with the raw fullname string. Because names can contain spaces and punctuation, this can generate invalid URLs (or unexpected query parsing). URL-encode the query value (e.g., via encodeURIComponent) before interpolating it into searchQuery=.
Suggested change:
- username = `https://orcid.org/orcid-search/search?searchQuery=${fullname}`;
+ username = `https://orcid.org/orcid-search/search?searchQuery=${encodeURIComponent(fullname)}`;
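To make the failure mode concrete, here is a small standalone check that reads the `searchQuery` parameter back out of both URL variants. The sample name is made up, and `readBack` is a hypothetical helper used only for this illustration.

```javascript
const SEARCH_BASE = 'https://orcid.org/orcid-search/search?searchQuery=';

// Made-up name containing characters that are significant in a query string.
const fullname = 'Ana & Bo Li';

const rawUrl = `${SEARCH_BASE}${fullname}`;
const encodedUrl = `${SEARCH_BASE}${encodeURIComponent(fullname)}`;

// Hypothetical helper: parse the URL and read the query value back.
const readBack = url => new URL(url).searchParams.get('searchQuery');

// Raw interpolation: "&" starts a new parameter, so the value is truncated.
console.log(readBack(rawUrl) === fullname);     // false
// Encoded interpolation: the full name survives the round trip.
console.log(readBack(encodedUrl) === fullname); // true
```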
return Promise.reject(new OpenReviewError({ name: 'Error', message: `The author name ${authorName} from index ${authorIndex} doesn't match with the names listed in your profile` }));
const authorUsername = publication.content.authors.value[authorIndex]?.username;
if (!usernames.some(username => username === authorUsername)) {
  return Promise.reject(new OpenReviewError({ name: 'Error', message: `The author name ${authorUsername} from index ${authorIndex} doesn't match with the names listed in your profile` }));
This error message says "author name" but the value being checked and interpolated here is the author's username (authorUsername). Updating the message to refer to "author username" (or using the fullname instead) will make failures easier to understand.
Suggested change:
- return Promise.reject(new OpenReviewError({ name: 'Error', message: `The author name ${authorUsername} from index ${authorIndex} doesn't match with the names listed in your profile` }));
+ return Promise.reject(new OpenReviewError({ name: 'Error', message: `The author username ${authorUsername} from index ${authorIndex} doesn't match with the usernames listed in your profile` }));
const userProfile = profiles[0];

const usernames = userProfile.content.names.map(name => name.username);
This pre-process script is not referenced anywhere in the repository (no invitation points to process/author_removal_pre_process.js). If Author_Removal is intended to use it, wire it up in openreview/profile/management.py; otherwise consider removing it to avoid maintaining dead code.
Suggested change:
- const usernames = userProfile.content.names.map(name => name.username);
+ const names = Array.isArray(userProfile?.content?.names) ? userProfile.content.names : [];
+ const usernames = names
+   .map(name => name && name.username)
+   .filter(username => !!username);
note = josiah_client.get_note(edit['note']['id'])
assert note.external_ids == ['doi:10.1103/physreva.109.022426']
assert '~Josiah_Couch1' == note.content['authorids']['value'][0]
assert note.content['authors']['value'][0] == {'fullname': 'Josiah Couch', 'username': '~Josiah_Couch1'}
In this test, there is an Authorship_Claim attempt shortly after this assertion that omits the newly-added author_name field. Since author_name is required by the updated invitation schema, that claim is likely to fail schema validation before the pre-process error you’re asserting is raised. Update the failing claim attempt to include author_name (or make author_name optional in the invitation if omission should be supported).
…penreview-py into feature/new-dblp-invitation
…penreview-py into feature/new-dblp-invitation
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
openreview/profile/process/deprecated_dblp_record_process.js:44
`abstractError` is assigned in the catch block but never used after removing the throw at the end of the function. This is now dead code and can be removed (and the catch can just log), or reintroduce the throw if failures should still fail the queue edit.

const html = note.content.html?.value;
let abstractError = false;
try {
  if (html) {
    const { abstract, pdf, error } = await Tools.extractAbstract(html);
    console.log('abstract: ' + abstract);
    console.log('pdf: ' + pdf);
    console.log('error: ' + error);
    if (abstract) {
      note.content.abstract = { value: abstract };
    }
    if (pdf) {
      note.content.pdf = { value: pdf };
    }
  } else {
    console.log('html field is empty');
  }
} catch (error) {
  console.log('error: ' + JSON.stringify(error?.toJson?.()));
  abstractError = error;
}
Summary
Migrates the DBLP / arXiv / ORCID public-article import flow from the legacy parallel `content.authors` + `content.authorids` string arrays to the new `author{}` object schema, where `content.authors.value` is a single list of `{ fullname, username }` objects. Updates every Public_Article invitation, pre-process, and post-process that touches the authors field, and extends the profile name-removal flow so names linked to new-schema publications are rewritten (or refused for auto-accept) correctly.

Motivation
The API's new `author{}` invitation type stores authorship as a single list of typed objects instead of two parallel arrays that callers had to zip and keep in sync. It fixes a class of bugs where `authors[i]` and `authorids[i]` drifted out of order, and it cleanly supports partial authorship info (e.g. a DBLP author with no OpenReview profile yet). All existing Authorship_Claim / Author_Removal flows had to be rebuilt against the new shape, and the name-removal support flow had to learn the new schema so it can still rewrite usernames on publications that contain a removed alternate name.

Changes
Record ingestion: DBLP / arXiv / ORCID
- […] `{ fullname, username }` objects and drop `content.authorids`. Preserve any usernames already claimed on the existing note.
- […] `contributors.contributor[*].credit-name`), using a local `formatCreditName` helper that converts `"Last, First"` into `"First Last"`. Marked with TODO comments to be replaced once `@openreview/client` ships an equivalent fix.
- […] `extractAbstract` / post-edit block in `try/catch` so a transient fetch failure is logged rather than crashing the whole process function.

Public_Article invitations
- `Authorship_Claim` and `Author_Removal` invitations now:
  - […] `author_name` content field alongside `author_index` / `author_id`.
  - […] `{ fullname, username }` object at the given index (hidden `const.replace`), instead of writing the username into `authorids[index]`.
- `Author_Removal` now binds to `author_removal_pre_process.js` (previously re-used the coreference pre-process).
- […] `authors: string[]` + `authorids: string[]` with `authors: author{}` (typed `{ fullname, username }`).
- `openreview.net/Public_Article/-/Edit` gains `openreview.net/Support` as an invitee/reader so the name-removal decision process can sign corrective edits.

Pre-process validation
- […] `Tools.prettyId(author_id) === author_name`, reads the publication's existing fullname from `authors[index].fullname`, and filters out empty usernames in the profile lookup.
- […] `Tools.prettyId(authors[index].username) === author_name` before removing, so a typo or stale index is rejected loudly.
- […] `externalId` to avoid duplicate keys; no longer throw on missing abstract).

Profile name removal
- […] `authors.value` is a list of dicts (new schema) or a list of strings (legacy), and routes writes through the correct shape. For new-schema notes it rewrites `{ fullname, username }` in place to the profile's preferred name / id; for legacy notes the existing dual-array update is unchanged.
- […] `Note.authors` property, so a request whose username is only linked via new-schema DBLP records is not auto-accepted.

SDK
- […] `Note.authors` property. Returns a canonical `[{ fullname, username }]` list regardless of which schema the note stores. Serialization (`to_json`) is untouched, so round-trips preserve the wire format.

Tests
New

- `test_remove_name_with_dblp_publication`: end-to-end test where a user imports an `openreview.net/Public_Article/DBLP.org/-/Record` note, claims authorship with an alternate name via `Authorship_Claim`, requests removal of the alternate name, support accepts, and the DBLP record's `authors[0]` is rewritten from `{ fullname: "Edith Alternate Last", username: "~Edith_Alternate_Last1" }` to `{ fullname: "Edith Last", username: "~Edith_Last1" }`.

Updated
- […] `authors: [{fullname, username}]` objects and assert against the new shape; `Authorship_Claim` / `Author_Removal` calls pass `author_name`; `error=True` removed from the `await_queue_edit` calls that used to tolerate post-process failures.
- […] `authors` / `authorids` while PC search still returns them, verifying the new indexing against the migrated schema.
- […] `Deletion-0-1` process queue now emitted by the workflow.

Test plan
- `pytest tests/test_profile_management.py` passes end-to-end.
- `pytest tests/test_profile_management.py::TestProfileManagement::test_remove_name_with_dblp_publication` passes (covers the name-removal → DBLP rewrite path).
- `pytest tests/test_iclr_conference_v2.py` passes (indexing assertions).
- `pytest tests/test_abstract_deadline.py` passes.
- […] `authors[claimed_index]`.
- […] `"First Last"` order for records where the upstream returns `"Last, First"` (covered by TODO block in `orcid_record_process.js`).
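
The schema detection and in-place rewrite described under "Profile name removal" and "SDK" can be sketched roughly as below. This is a JavaScript illustration for consistency with the process scripts; the real accessor is the Python `Note.authors` property, whose implementation is not shown in this description. The function names `canonicalAuthors` and `replaceUsername` are hypothetical, and the sample values come from the new test described above.

```javascript
// Hypothetical analogue of a canonical-authors accessor: returns
// [{ fullname, username }] whichever schema the note content uses.
function canonicalAuthors(content) {
  const authors = content?.authors?.value ?? [];
  const authorids = content?.authorids?.value ?? [];
  return authors.map((entry, i) => {
    if (entry !== null && typeof entry === 'object') {
      // New schema: the entry is already an author object.
      return { fullname: entry.fullname ?? '', username: entry.username ?? '' };
    }
    // Legacy schema: pair the i-th name with the i-th id.
    return { fullname: entry, username: authorids[i] ?? '' };
  });
}

// Hypothetical rewrite step: swap every entry that carries the removed
// username for the profile's preferred name and id.
function replaceUsername(authors, removedUsername, preferred) {
  return authors.map(author =>
    author.username === removedUsername ? { ...preferred } : author
  );
}

const newSchemaContent = {
  authors: { value: [{ fullname: 'Edith Alternate Last', username: '~Edith_Alternate_Last1' }] },
};
const legacyContent = {
  authors: { value: ['Edith Alternate Last'] },
  authorids: { value: ['~Edith_Alternate_Last1'] },
};

const rewritten = replaceUsername(
  canonicalAuthors(newSchemaContent),
  '~Edith_Alternate_Last1',
  { fullname: 'Edith Last', username: '~Edith_Last1' }
);
console.log(rewritten);
```

Reading through one canonical accessor means a check such as "is this username linked to any publication" does not need to know which schema a given note stores.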