Enabling sexual orientation attribute #233
@PS5138 thank you for creating this PR! We will review in the next couple of weeks and provide any feedback we have.
dylanbouchard left a comment:
Thank you for this PR! Please provide screenshots or a rigorous demonstration dataset showing that the counterfactual prompt creation works in a wide variety of cases.
SEXUAL_ORIENTATION_WORDS_REQUIRING_CONTEXT: List[str] = [
    "gay",
    "straight",
    "pride",
Please check that these align with the context/person words that exist in the word list. Have you tested substitutions with these words? It would be helpful to have screenshots showing how these substitutions are handled.
Removed "pride": when checked against the existing PERSON_WORDS list, "pride" generates only unnatural token-pairs such as "pride accountant", "pride actress", and "pride manager". None of these phrases appear in real text, so the entry was effectively matching nothing. "pride" has been removed from SEXUAL_ORIENTATION_WORDS_REQUIRING_CONTEXT.
"gay" and "straight" both align well: they produce natural token-pairs with PERSON_WORDS, e.g. "gay man", "gay employee", "gay actor", "straight woman", "straight person". This mirrors how "black" and "white" behave in the existing RACE_WORDS_REQUIRING_CONTEXT. They only match when followed by a person noun, which avoids false positives from unrelated uses of the word.
Full substitution output across a wide variety of cases is shown in my reply to your general comment below.
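To make the matching rule concrete, here is a minimal sketch of "context word followed by a person noun" detection. It is an illustration only, not the code in this PR: the two word lists are heavily shortened, hypothetical stand-ins for PERSON_WORDS and SEXUAL_ORIENTATION_WORDS_REQUIRING_CONTEXT, `find_context_pairs` is a made-up helper name, and a regular expression is used here for brevity where the library may work on token-pairs differently.

```python
import re
from typing import List

# Hypothetical, shortened stand-ins for the real word lists.
PERSON_WORDS: List[str] = ["man", "woman", "person", "employee", "actor"]
CONTEXT_WORDS: List[str] = ["gay", "straight"]

# A context word counts only when the very next word is a person noun.
_PAIR_PATTERN = re.compile(
    r"\b(" + "|".join(CONTEXT_WORDS) + r")\s+(" + "|".join(PERSON_WORDS) + r")\b",
    flags=re.IGNORECASE,
)


def find_context_pairs(text: str) -> List[str]:
    """Return every 'context-word person-noun' pair found in text, lowercased."""
    return [f"{m.group(1)} {m.group(2)}".lower() for m in _PAIR_PATTERN.finditer(text)]


print(find_context_pairs("The gay employee met a straight woman."))
# ['gay employee', 'straight woman']
print(find_context_pairs("She is gay."))            # [] (no following person noun)
print(find_context_pairs("Drive straight ahead."))  # [] (unrelated use of the word)
```

Under these assumptions the standalone "She is gay." is skipped, and so are unrelated uses such as "drive straight ahead".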
- Remove "pride" from SEXUAL_ORIENTATION_WORDS_REQUIRING_CONTEXT; it
generates unnatural token-pairs with PERSON_WORDS (e.g. "pride
accountant") that never appear in real text
- Fix plural preservation in _replace_sexual_orientation so that plural
source words (e.g. "homosexuals") map to plural targets ("heterosexuals")
rather than the singular form
- Clean up for-loop style in STRICT_SEXUAL_ORIENTATION_WORDS construction
- Expand tests to cover plural substitution and REQUIRING_CONTEXT
word + person-noun pairs
- Remove personal notebook entry from .gitignore
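
The plural-preservation fix described above could look roughly like the following. This is a sketch under stated assumptions, not the actual body of _replace_sexual_orientation: `SINGULAR_MAP` and `replace_preserving_plural` are hypothetical names, the mapping has a single illustrative entry, and only a trailing "s" is treated as a plural marker.

```python
from typing import Dict

# Hypothetical singular-to-singular mapping; not the PR's actual table.
SINGULAR_MAP: Dict[str, str] = {"homosexual": "heterosexual"}


def replace_preserving_plural(word: str) -> str:
    """Map a source word to its target, keeping a trailing plural 's'."""
    lower = word.lower()
    if lower in SINGULAR_MAP:
        return SINGULAR_MAP[lower]
    # Plural source word: map the singular stem, then re-append the 's'.
    if lower.endswith("s") and lower[:-1] in SINGULAR_MAP:
        return SINGULAR_MAP[lower[:-1]] + "s"
    # Anything else is left untouched.
    return word


print(replace_preserving_plural("homosexual"))   # heterosexual
print(replace_preserving_plural("homosexuals"))  # heterosexuals
print(replace_preserving_plural("accountants"))  # accountants
```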
Here is a demonstration of parse_texts and create_prompts running across a wide variety of inputs. I've also summarised all the changes made since the initial submission.

Changes made:

Known limitation (consistent with existing race behaviour): REQUIRING_CONTEXT words only match when followed by a person noun. "She is gay." (standalone, no following noun) is not detected; this is the same behaviour as "black" and "white" in RACE_WORDS_REQUIRING_CONTEXT.

DEMO SCRIPT:

"""
from langfair.generator.counterfactual import CounterfactualGenerator
# Minimal LLM object needed to instantiate the generator
# (no real API calls are made; we're only demonstrating prompt creation)
llm = AzureChatOpenAI(
prompts = [
print("=" * 70)
print()
original = result["original_prompt"]
print(f"\nPrompts containing sexual orientation words: {len(original)}")
for i, orig in enumerate(original):

OUTPUT:
======================================================================
Please let me know if you'd like any additional test cases or further demonstration of edge cases; happy to add more coverage before this is merged!
Closes #142
Hi! This is my second open-source contribution, so I appreciate your patience. I've done my best to follow the existing code patterns and contributing guidelines, but if I've missed anything or if there are changes you'd like me to make, please let me know and I'll be happy to work on it.
Description
This PR adds sexual orientation as a third protected attribute to the CounterfactualGenerator, alongside the existing gender and race attributes. The implementation follows the same substitution strategy used for race (all-to-one replacement) with four groups: heterosexual, gay, lesbian, and bisexual.
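As a rough illustration of the all-to-one strategy, the sketch below builds one counterfactual per target group by replacing every detected orientation word with that group's word. It is a simplification and not the PR's implementation: `STRICT_WORDS` is a hypothetical, shortened list of words that are replaced without needing context, `all_to_one` is a made-up helper name, and context-dependent words such as "gay" and "straight" are deliberately left out of the detection list.

```python
import re
from typing import Dict, List

# The four target groups named in the description.
GROUPS: List[str] = ["heterosexual", "gay", "lesbian", "bisexual"]

# Hypothetical, shortened list of words detected without extra context.
STRICT_WORDS: List[str] = ["heterosexual", "homosexual", "lesbian", "bisexual"]

_STRICT_PATTERN = re.compile(r"\b(" + "|".join(STRICT_WORDS) + r")\b", re.IGNORECASE)


def all_to_one(text: str) -> Dict[str, str]:
    """Return one variant per group, with every detected word set to that group."""
    return {group: _STRICT_PATTERN.sub(group, text) for group in GROUPS}


variants = all_to_one("A lesbian writer interviewed a bisexual artist.")
for group, prompt in variants.items():
    print(f"{group}: {prompt}")
# heterosexual: A heterosexual writer interviewed a heterosexual artist.
# gay: A gay writer interviewed a gay artist.
# lesbian: A lesbian writer interviewed a lesbian artist.
# bisexual: A bisexual writer interviewed a bisexual artist.
```

Every variant differs from the others only in the group word, so differences in model output can be attributed to the attribute.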
Changes in detail
langfair/constants/word_lists.py
langfair/generator/counterfactual.py
langfair/auto/auto.py
tests/test_counterfactualgenerator.py
Contributor License Agreement
Tests
All tests passed.
Documentation
Screenshots
N/A