[WIP, Agent] Extension editing experiments: removing edit_file_by_line #2846

Closed
wants to merge 54 commits into from

xingyaoww
Contributor

@xingyaoww xingyaoww commented Jul 7, 2024

What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?

Extension experiment of #2722.

Check #2722 (comment) for more details.

Give a brief summary of what the PR does, explaining any non-trivial design decisions

Other references

The main purpose of this PR for now is to help track experiments and run CI. No review is required yet.

@xingyaoww
Contributor Author

This PR is tested and merged into #2722. See result here: #2722 (comment)

@xingyaoww xingyaoww closed this Jul 10, 2024
@tobitege
Collaborator

Please bear with me, but the one additional case that got solved compared to before: is that actually because of the changed editing?
Or is it just Sonnet being Sonnet and sometimes approaching things differently?

@xingyaoww
Contributor Author

We keep temperature=0.0 for all experiments; that is, we try to control for as much randomness as possible.
The same model with edit_file_by_line does degrade overall performance more (69, I believe), which suggests that offering two methods of editing sometimes confuses the model and may not be the best approach.

Plus, in this new run, we have 20+ instances that Sonnet wasn't able to run because of ContentPolicyViolationError, not because of anything else, whereas I remember our previous run having fewer or none of these errors (they probably upgraded these filters fairly recently). So I'd guess this PR is at least helping more than hurting, especially since we ran a controlled experiment (with and without edit_file_by_line, same model).

@xingyaoww xingyaoww deleted the xw/edit-remove-by-line branch August 6, 2024 22:50
3 participants