Integrate BugsInPy #184
Open: t-sorger wants to merge 52 commits into master from BugsInPy
Changes from 34 commits
a09695d add BugsInPy submodule (t-sorger)
c9384d5 add initial BugsInPybug.py (t-sorger)
ce48490 add initial BugsInPy.py to benchmark (t-sorger)
865975b add BugsInPy to core utils (t-sorger)
e8976c5 add initial tests for BugsInPy; fix typo (t-sorger)
9a3325d add BugsInPy submodule (t-sorger)
96d79c5 add initial BugsInPybug.py (t-sorger)
83b35cd add initial BugsInPy.py to benchmark (t-sorger)
0cf0179 add BugsInPy to core utils (t-sorger)
e09839c add initial tests for BugsInPy; fix typo (t-sorger)
f335bdf add test implementation for BugsInPybug (t-sorger)
2bc479a fix bin path issues (t-sorger)
bd08ec1 lint code (t-sorger)
11600a3 rework tests for BugsInPy (t-sorger)
1cc7bc6 update submodules (t-sorger)
0d28f9d Merge branch 'BugsInPy' of github.com:ASSERT-KTH/repairbench-framewor… (t-sorger)
d3de871 add BugsInPy submodule (t-sorger)
56f4502 add initial BugsInPybug.py (t-sorger)
8274a8d add initial BugsInPy.py to benchmark (t-sorger)
63f5834 add BugsInPy to core utils (t-sorger)
8e761a6 add initial tests for BugsInPy; fix typo (t-sorger)
41821d4 add test implementation for BugsInPybug (t-sorger)
28e4c9a fix bin path issues (t-sorger)
21420fd lint code (t-sorger)
5962796 rework tests for BugsInPy (t-sorger)
ea287fa update submodules (t-sorger)
17c438d Merge branch 'BugsInPy' of github.com:ASSERT-KTH/repairbench-framewor… (t-sorger)
7177e86 adds RichBug and fixes process calls (t-sorger)
7a195e0 add checks and fix path issues (t-sorger)
1c2f662 fix code and first tests (t-sorger)
1845b6d fix error in tests (t-sorger)
f0cfa76 lint code (t-sorger)
1c1ea5e start adding instruct test and new python utils (t-sorger)
1e0ffd0 update python.py (t-sorger)
edd053f update Python utils and comment other test cases (t-sorger)
c74c397 add InfillingPromptingPython (t-sorger)
b679250 update utils for Python (t-sorger)
994e21e add test infilling for BugsInPy codellama (t-sorger)
4d3561c lint files (t-sorger)
c583a39 uncomment other infilling tests (t-sorger)
779340a add initial files for language_utils (t-sorger)
76272cf add get_language_utils method (t-sorger)
b1e684f add usage of LanguageUtils for infilling (t-sorger)
b72565c add first docker adoptations (t-sorger)
5507ee7 update BugsInPy for Docker (t-sorger)
029538a lint files (t-sorger)
04a0fc0 update steup (t-sorger)
b629e73 add sample/instruct test for BugsInPy (t-sorger)
70e7251 add sample/infilling test for BugsInPy (t-sorger)
6dd1290 add evaluation tests for BugsInPy (t-sorger)
7c21a6d add missing tests for RichBug implementation of BugsInPy (t-sorger)
4963e5b remove prints (t-sorger)
Submodule gitbug-java updated (5 files):
| Lines changed | File |
|---|---|
| +27 −80 | README.md |
| +1 −1 | gitbug-java |
| +3 −13 | gitbug/bug.py |
| +277 −501 | poetry.lock |
| +3 −3 | pyproject.toml |
Submodule cache updated from 06cd07 to 0d3f97
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,136 @@ | ||
| from pathlib import Path | ||
| from typing import Optional | ||
| from io import StringIO | ||
| from elleelleaime.core.benchmarks.benchmark import Benchmark | ||
| from elleelleaime.core.benchmarks.BugsInPy.BugsInPybug import BugsInPyBug | ||
|
|
||
| import subprocess | ||
| import logging | ||
|
|
||
| # import tqdm | ||
| import re | ||
|
|
||
| # import os | ||
| import pandas as pd | ||
|
|
||
|
|
||
| class BugsInPy(Benchmark): | ||
| """ | ||
| The class for representing the BugsInPy benchmark. | ||
| """ | ||
|
|
||
| def __init__(self, path: Path = Path("benchmarks/BugsInPy").absolute()) -> None: | ||
| super().__init__("BugsInPy", path) | ||
|
|
||
| def get_bin(self, options: str = "") -> Optional[str]: | ||
| return f'{Path(self.path, "framework/bin/")}' | ||
|
|
||
| def initialize(self) -> None: | ||
| """ | ||
| Initializes the BugsInPy benchmark object by collecting the list of all projects and bugs. | ||
| """ | ||
| logging.info("Initializing BugsInPy benchmark...") | ||
|
|
||
| # Get all project names | ||
| run = subprocess.run( | ||
| f"ls {self.path}/projects", | ||
| shell=True, | ||
| capture_output=True, | ||
| check=True, | ||
| ) | ||
| project_names = { | ||
| project_name.decode("utf-8") for project_name in run.stdout.split() | ||
| } | ||
| logging.info("Found %3d projects" % len(project_names)) | ||
|
|
||
| # Get all bug names for all project_name | ||
| bugs = {} | ||
| # for project_name in tqdm.tqdm(project_names): | ||
| for project_name in project_names: | ||
| run = subprocess.run( | ||
| f"ls {self.path}/projects/{project_name}/bugs", | ||
| shell=True, | ||
| capture_output=True, | ||
| check=True, | ||
| ) | ||
| # bugs[project_name] = { | ||
| # int(bug_id.decode("utf-8")) for bug_id in run.stdout.split() | ||
| # } | ||
|
|
||
| bugs[project_name] = set() | ||
| for bug_id in run.stdout.split(): | ||
| try: | ||
| bug_id_int = int(bug_id.decode("utf-8")) | ||
| bugs[project_name].add(bug_id_int) | ||
| except ValueError: | ||
| logging.warning( | ||
| f"Skipping invalid bug ID: {bug_id.decode('utf-8')}" | ||
| ) | ||
|
|
||
| logging.info( | ||
| "Found %3d bugs for project %s" | ||
| % (len(bugs[project_name]), project_name) | ||
| ) | ||
|
|
||
| # Initialize dataset | ||
| for project_name in project_names: | ||
| # Create a DataFrame to store the failing test cases and trigger causes | ||
| df = pd.DataFrame(columns=["bid", "tests", "errors"]) | ||
|
|
||
| for bug_id in bugs[project_name]: | ||
| # Extract ground truth diff | ||
| diff_path = f"benchmarks/BugsInPy/projects/{project_name}/bugs/{bug_id}/bug_patch.txt" | ||
| with open(diff_path, "r", encoding="ISO-8859-1") as diff_file: | ||
| diff = diff_file.read() | ||
|
|
||
| # Extract failing test cases and trigger causes | ||
| # failing_test_cases = df[df["bug_id"] == bug_id]["tests"].values[0] | ||
| # trigger_cause = df[df["bug_id"] == bug_id]["errors"].values[0] | ||
|
|
||
| # Moved into BugsInPybug.py | ||
| # # Checkout the bug | ||
| # checkout_run = subprocess.run( | ||
| # f"{self.benchmark.get_bin()}bugsinpy-checkout -p {self.project_name} -v {self.version_id} -i {self.bug_id}", | ||
| # shell=True, | ||
| # capture_output=True, | ||
| # check=True, | ||
| # ) | ||
|
|
||
| # # Compile and test the bug | ||
| # path = f"{self.benchmark.get_bin()}/temp/{project_name}" | ||
| # checkout_compile = subprocess.run( | ||
| # f"{self.benchmark.get_bin()}bugsinpy-compile -w {path}", | ||
| # shell=True, | ||
| # capture_output=True, | ||
| # check=True, | ||
| # ) | ||
|
|
||
| # checkout_compile = subprocess.run( | ||
| # f"{self.benchmark.get_bin()}bugsinpy-test -w {path}", | ||
| # shell=True, | ||
| # capture_output=True, | ||
| # check=True, | ||
| # ) | ||
|
|
||
| # # Check with default path | ||
| # fail_path = f"{self.benchmark.get_bin()}/temp/{project_name}/bugsinpy_fail.txt" | ||
| # with open(fail_path, "r", encoding="ISO-8859-1") as fail_file: | ||
| # failing_tests_content = fail_file.read() | ||
|
|
||
| # # Use a regular expression to extract the test name and its context | ||
| # pattern = r"FAIL: ([\w_.]+ \([\w_.]+\))" | ||
| # matches = re.findall(pattern, failing_tests_content) | ||
|
|
||
| # # Store the results in a dictionary if needed | ||
| # failing_tests = {"failing_tests": matches} | ||
|
|
||
| self.add_bug( | ||
| BugsInPyBug( | ||
| self, | ||
| project_name=project_name, | ||
| bug_id=bug_id, | ||
| version_id=0, # 0 buggy -- is this always the case? | ||
| ground_truth=diff, | ||
| failing_tests=None, # needs to be checked out for this? | ||
| ) | ||
| ) | ||
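
The `initialize` method above discovers projects and bug IDs by shelling out to `ls`. A minimal sketch of the same discovery step done with `pathlib` instead, assuming the `projects/<name>/bugs/<id>` layout that the diff relies on. `collect_bug_ids` is a hypothetical helper name and is not part of this PR; unlike the `ls` version it also skips non-directory entries under `projects`.

```python
import logging
import tempfile
from pathlib import Path


def collect_bug_ids(benchmark_path: Path) -> dict[str, set[int]]:
    """Map each project directory name to the set of its integer bug IDs."""
    bugs: dict[str, set[int]] = {}
    projects_dir = benchmark_path / "projects"
    for project_dir in sorted(p for p in projects_dir.iterdir() if p.is_dir()):
        bugs_dir = project_dir / "bugs"
        ids: set[int] = set()
        if bugs_dir.is_dir():
            for entry in bugs_dir.iterdir():
                try:
                    ids.add(int(entry.name))
                except ValueError:
                    # Same policy as the diff: non-numeric entries are skipped.
                    logging.warning("Skipping invalid bug ID: %s", entry.name)
        bugs[project_dir.name] = ids
    return bugs


# Demo on a throwaway directory tree that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for bug in ("1", "2", "notes"):
        (root / "projects" / "demo" / "bugs" / bug).mkdir(parents=True)
    demo_bugs = collect_bug_ids(root)
    print(demo_bugs)
```

Avoiding `shell=True` here removes the dependency on `ls` output splitting, which would mis-handle names containing whitespace.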
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,100 @@ | ||
| import subprocess | ||
| import shutil | ||
| import re | ||
| import os | ||
|
|
||
| from elleelleaime.core.benchmarks.benchmark import Benchmark | ||
|
|
||
| # TODO: Implement as `RichBug` later on | ||
| from elleelleaime.core.benchmarks.bug import RichBug | ||
| from elleelleaime.core.benchmarks.test_result import TestResult | ||
| from elleelleaime.core.benchmarks.compile_result import CompileResult | ||
|
|
||
|
|
||
| class BugsInPyBug(RichBug): | ||
| """ | ||
| The class for representing BugsInPy bugs | ||
| """ | ||
|
|
||
| def __init__( | ||
| self, | ||
| benchmark: Benchmark, | ||
| project_name: str, | ||
| bug_id: str, | ||
| version_id: str, # 1 fixed, 0 buggy | ||
| ground_truth: str, | ||
| failing_tests: dict[str, str], | ||
| ) -> None: | ||
| self.project_name = project_name | ||
| self.bug_id = bug_id | ||
| self.version_id = version_id | ||
| super().__init__( | ||
| benchmark, | ||
| f"{project_name}-{bug_id}", | ||
| ground_truth, | ||
| failing_tests, | ||
| # ground_truth_inverted=True, # TODO: TypeError: Bug.__init__() got multiple values for argument 'ground_truth_inverted' | ||
| ) | ||
|
|
||
| def checkout(self, path: str, fixed: bool = False) -> bool: | ||
| project_name, bug_id = path.rsplit("-", 1) | ||
|
|
||
| # Remove the directory if it exists | ||
| shutil.rmtree(path, ignore_errors=True) | ||
|
|
||
| # Checkout the bug | ||
| checkout_run = subprocess.run( | ||
| f"{self.benchmark.get_bin()}/bugsinpy-checkout -p {project_name} -v {fixed} -i {bug_id}", # 1 fixed, 0 buggy | ||
| # f"{self.benchmark.get_bin()}/bugsinpy-checkout -p {self.project_name} -v {self.version_id} -i {self.bug_id}", | ||
| shell=True, | ||
| capture_output=True, | ||
| check=True, | ||
| ) | ||
|
|
||
| # Convert line endings to unix | ||
| dos2unix_run = subprocess.run( | ||
| f"find {path} -type f -print0 | xargs -0 -n 1 -P 4 dos2unix", | ||
| shell=True, | ||
| capture_output=True, | ||
| check=True, | ||
| ) | ||
|
|
||
| return checkout_run.returncode == 0 and dos2unix_run.returncode == 0 | ||
|
|
||
| def compile(self, path: str) -> CompileResult: | ||
| project_name, bug_id = path.rsplit("-", 1) | ||
| run = subprocess.run( | ||
| f"{self.benchmark.get_bin()}/bugsinpy-compile -w {self.benchmark.get_bin()}/temp/{project_name}", | ||
| shell=True, | ||
| capture_output=True, | ||
| check=True, | ||
| ) | ||
|
|
||
| return CompileResult(run.returncode == 0) | ||
|
|
||
| def test(self, path: str) -> TestResult: | ||
| project_name, bug_id = path.rsplit("-", 1) | ||
|
|
||
| run = subprocess.run( | ||
| f"{self.benchmark.get_bin()}/bugsinpy-test -w {self.benchmark.get_bin()}/temp/{project_name}", | ||
| shell=True, | ||
| capture_output=True, | ||
| check=False, | ||
| ) | ||
|
|
||
| # Decode the output and extract the last line | ||
| stdout_lines = run.stdout.decode("utf-8").strip().splitlines() | ||
| last_line = stdout_lines[-1] if stdout_lines else "" | ||
|
|
||
| if "OK" in last_line: | ||
| success = True | ||
| else: # FAILED or any unrecognized output counts as a failure | ||
| success = False | ||
|
|
||
| return TestResult(success) | ||
|
|
||
| def get_src_test_dir(self, path: str) -> str: | ||
| project_name, bug_id = path.rsplit("-", 1) | ||
| path = f"{self.benchmark.get_bin()}/temp/{project_name}/test" | ||
|
|
||
| return path |
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,75 @@ | ||
| from typing import Optional, Tuple, List | ||
| from unidiff import PatchSet | ||
| from uuid import uuid4 | ||
| import uuid | ||
| from pathlib import Path | ||
| import logging | ||
| import getpass, tempfile, difflib, shutil | ||
| import subprocess | ||
| import re | ||
| import ast | ||
|
|
||
| from elleelleaime.core.benchmarks.bug import Bug, RichBug | ||
|
|
||
|
|
||
| def extract_functions(source_code): | ||
| # Parse the source code into an AST | ||
| tree = ast.parse(source_code) | ||
|
|
||
| # Extract all function definitions | ||
| functions = [node for node in tree.body if isinstance(node, ast.FunctionDef)] | ||
|
|
||
| # Convert the function nodes back to source code | ||
| function_sources = [ast.get_source_segment(source_code, func) for func in functions] | ||
|
|
||
| return function_sources | ||
|
|
||
|
|
||
| def extract_single_function(bug: Bug) -> Optional[Tuple[str, str]]: | ||
| """ | ||
| Extracts the buggy and fixed code of single-function bugs. | ||
| Returns None if the bug is not single-function. | ||
|
|
||
| Args: | ||
| bug (Bug): The bug to extract the code from | ||
|
|
||
| Returns: | ||
| Optional[Tuple[str, str]]: None if the bug is not single-function, otherwise a tuple of the form (buggy_code, fixed_code) | ||
| """ | ||
| project_name, _ = bug.get_identifier().rsplit("-", 1) | ||
| path = f"./benchmarks/BugsInPy/projects/{project_name}" | ||
|
|
||
|
|
||
| print(f"{path=}") | ||
|
|
||
| try: | ||
| # Checkout the buggy version of the bug | ||
| bug.checkout(bug.get_identifier(), fixed=0) | ||
| bug.compile(bug.get_identifier()) | ||
| # Test fixed version | ||
| # test_result = bug.test(bug.get_identifier()) | ||
|
|
||
|
|
||
| path_bin = f"./benchmarks/BugsInPy/framework/bin/temp/{project_name}" | ||
| with open(Path(path_bin, "test", f"test_aes.py")) as f: | ||
| buggy_code = f.read() | ||
|
|
||
| buggy_functions = extract_functions(buggy_code) | ||
|
|
||
| # Checkout the fixed version of the bug | ||
| bug.checkout(bug.get_identifier(), fixed=1) | ||
| bug.compile(bug.get_identifier()) | ||
|
|
||
| with open(Path(path_bin, "test", f"test_aes.py")) as f: | ||
| fixed_code = f.read() | ||
|
|
||
| buggy_functions = extract_functions(buggy_code) | ||
| fixed_functions = extract_functions(fixed_code) | ||
|
|
||
| assert len(buggy_functions) == len(fixed_functions) | ||
|
|
||
| return buggy_code, fixed_code | ||
|
|
||
| finally: | ||
| # Remove the checked-out bugs | ||
| # shutil.rmtree(path_bin, ignore_errors=True) | ||
| pass | ||
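
`extract_functions` in the diff above only inspects `tree.body`, so it returns top-level `def` statements and skips methods and `async def` functions. A sketch of a variant that walks the whole tree, in case nested definitions are wanted. `extract_all_functions` is a hypothetical name and is not part of this PR.

```python
import ast


def extract_all_functions(source_code: str) -> list[str]:
    """Return the source of every function or method, in source order."""
    tree = ast.parse(source_code)
    nodes = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    # ast.walk makes no ordering promise, so sort by source position.
    nodes.sort(key=lambda n: (n.lineno, n.col_offset))
    # get_source_segment can return None when position info is missing.
    return [ast.get_source_segment(source_code, n) or "" for n in nodes]


sample = (
    "def top():\n"
    "    return 1\n"
    "\n"
    "class A:\n"
    "    def method(self):\n"
    "        return 2\n"
)
found = extract_all_functions(sample)
print(len(found))
```

Comparing only the number of functions between the buggy and fixed versions, as the `assert` in the diff does, would not by itself show that a single function changed; comparing the extracted sources pairwise would.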
Consider logging the specific `ValueError` exception message for better debugging. This will help identify the cause of the invalid bug ID. For example, you can log `str(e)` to capture the error message.
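
A minimal sketch of what this suggestion could look like, assuming the bug IDs arrive as the byte strings produced by `run.stdout.split()` in the diff. `parse_bug_ids` is a hypothetical helper name used only for illustration.

```python
import logging


def parse_bug_ids(raw_ids: list[bytes]) -> set[int]:
    """Convert raw directory names to integer bug IDs, logging rejects."""
    ids: set[int] = set()
    for raw in raw_ids:
        text = raw.decode("utf-8")
        try:
            ids.add(int(text))
        except ValueError as e:
            # Include the exception text so the log shows why it was rejected.
            logging.warning("Skipping invalid bug ID %r: %s", text, e)
    return ids


print(parse_bug_ids([b"1", b"README", b"2"]))
```

Passing the values as logging arguments rather than formatting an f-string defers string building until the record is actually emitted.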