Add CLAUDE.md documentation for AI assistants #10

Open
MiPlayer123 wants to merge 6 commits into WellyZhang:master from MiPlayer123:claude/claude-md-mivgx1g1ib55jbcc-0116CmpvNU1YV9nh14UFQpY5

@MiPlayer123

Create comprehensive documentation covering:

  • Project overview and structure
  • Key concepts (AoT grammar, configurations, rules)
  • Development environment and requirements
  • Common commands for dataset generation and training
  • Dataset format specifications
  • Code patterns and conventions
  • Common issues and external resources

Changes made:
- Replace scipy.misc.comb with scipy.special.comb (AoT.py, sampling.py)
- Fix integer division / to // where needed (const.py, Rule.py, constraints.py)
- Convert print statements to print() functions (main.py)
- Wrap dict.keys() in list() for random.choice compatibility (main.py)
- Convert range() to list(range()) where list operations are used (Rule.py)
- Fix RLE encoding to use proper binary mask (api.py)
- Fix XML serialization to return unicode string (serialize.py)
- Update CLAUDE.md to reflect Python 3 compatibility
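
The Python 3 fixes listed above follow a few recurring patterns. The following is a minimal sketch of those patterns, not the actual changes made in AoT.py, Rule.py, or main.py. The helper names (`n_choose_k`, `half`, `pick_key`, `shuffled_indices`) are hypothetical, and the `math.comb` fallback is an assumption added only so the sketch runs without SciPy.

```python
import random

try:
    # scipy.misc.comb is gone from current SciPy; scipy.special.comb replaces it.
    from scipy.special import comb as _comb

    def n_choose_k(n, k):
        # exact=True asks for an exact integer rather than a float.
        return int(_comb(n, k, exact=True))
except ImportError:
    # Fallback so the sketch runs without SciPy (math.comb needs Python 3.8+).
    from math import comb as n_choose_k


def half(n):
    # In Python 3, n / 2 is a float; // keeps the integer result Python 2 gave.
    return n // 2


def pick_key(mapping, rng):
    # dict.keys() is a view in Python 3 and cannot be indexed by random.choice.
    return rng.choice(list(mapping.keys()))


def shuffled_indices(n, rng):
    # range() is lazy in Python 3; build a real list before mutating it.
    indices = list(range(n))
    rng.shuffle(indices)
    return indices
```

Passing a `random.Random` instance instead of using the module-level functions is a design choice of this sketch; it keeps the helpers reproducible under a fixed seed.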

Dataset generation now works with Python 3.7+ and generates correct
NPZ and XML files for all 7 figure configurations.
Provides helpful error message if user tries to run with Python 2.
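
One way such a guard could look is sketched below. The function name `require_python3` and the message wording are assumptions, not the code in this PR; the 3.7 minimum is taken from the commit message above.

```python
import sys


def require_python3(version_info=None, minimum=(3, 7)):
    """Exit with a readable message when the interpreter is too old."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if (major, minor) < minimum:
        # SystemExit with a string prints the message and exits with status 1.
        raise SystemExit(
            "This script requires Python %d.%d or newer; found %d.%d."
            % (minimum[0], minimum[1], major, minor)
        )
```

Calling `require_python3()` at the top of the entry script, before any Python 3 only imports, would let a Python 2 user see the message instead of a syntax or import error further down.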
- Generate 350 total samples (50 per configuration x 7 configurations)
- Add DATASET_GENERATION.md documenting the generation process
- Include all generated NPZ and XML files

Dataset generated with:
  python src/dataset/main.py --num-samples 50 --save-dir ./dataset --seed 42

All 7 configurations achieved 100% solver accuracy.
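
A quick sanity check on the generated output could count the NPZ files per configuration. This sketch assumes the layout `<save-dir>/<configuration>/*.npz`, which is inferred from the paths used in this PR rather than confirmed; `count_samples` and `is_complete` are hypothetical helper names, and the configuration names are the seven listed in this PR.

```python
from pathlib import Path

# The seven figure configurations named in this PR.
CONFIGURATIONS = [
    "center_single",
    "distribute_four",
    "distribute_nine",
    "left_center_single_right_center_single",
    "up_center_single_down_center_single",
    "in_center_single_out_center_single",
    "in_distribute_four_out_center_single",
]


def count_samples(dataset_dir):
    """Return {configuration: number of .npz files} under dataset_dir."""
    root = Path(dataset_dir)
    return {
        name: len(list((root / name).glob("*.npz")))
        for name in CONFIGURATIONS
    }


def is_complete(counts, expected_per_config=50):
    """True when every configuration has exactly the expected sample count."""
    return all(n == expected_per_config for n in counts.values())
```

With 50 samples per configuration, `sum(count_samples("./dataset").values())` would be expected to equal 350 if the assumed layout holds.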
Generate 10 visualization images per configuration (70 total):
- Shows 3x3 context grid (8 panels + empty answer cell)
- Shows 8 answer choices below
- Correct answer highlighted with darker border

Configurations visualized:
- center_single
- distribute_four
- distribute_nine
- left_center_single_right_center_single
- up_center_single_down_center_single
- in_center_single_out_center_single
- in_distribute_four_out_center_single
Usage examples:
  # Single file
  python src/dataset/visualize.py -i sample.npz -o output.png

  # Directory of samples
  python src/dataset/visualize.py --input-dir ./dataset/center_single --output-dir ./viz

  # Full dataset (all configurations)
  python src/dataset/visualize.py --dataset-dir ./dataset --output-dir ./viz -n 10

Features:
- Visualizes 3x3 context grid with answer choices below
- Highlights correct answer with darker border
- Supports single file, directory, or full dataset modes
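
The three modes above could be wired with `argparse` roughly as follows. This is a sketch and not the contents of visualize.py: only the flags shown in the usage examples are used, while the `dest` names, the help strings, and the choice to make the three inputs mutually exclusive are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Sketch of a three-mode visualizer CLI"
    )
    # Exactly one input source must be given (an assumption of this sketch).
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-i", dest="input_file", help="single .npz sample")
    source.add_argument("--input-dir", help="directory of samples")
    source.add_argument(
        "--dataset-dir", help="dataset root, one folder per configuration"
    )
    parser.add_argument("-o", dest="output_file", help="output image path")
    parser.add_argument("--output-dir", help="directory for output images")
    parser.add_argument(
        "-n", dest="num_samples", type=int, default=None,
        help="number of samples to render per configuration",
    )
    return parser


def select_mode(args):
    """Map parsed arguments to one of the three modes."""
    if args.input_file is not None:
        return "file"
    if args.input_dir is not None:
        return "directory"
    return "dataset"
```

A mutually exclusive group makes `argparse` reject ambiguous invocations, such as passing both `-i` and `--dataset-dir`, before any file is read.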
@WellyZhang
Owner

Hi Mikul,

Thanks for the PR. In general, I tend to keep the repo as it is to properly reflect the state of the paper back then. Maybe you want to consider forking the repo and committing directly to it.
