[Agent] Improve edits by adding back `append_file` #2722

xingyaoww · 2024-07-01T14:51:51Z

What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?

Dependency: #2805

This is a follow-up of #2685.

Give a brief summary of what the PR does, explaining any non-trivial design decisions

Per discussion with @tobitege in #2685, this PR:

- ~~Adds back `edit_file_by_line` and makes it coexist with `edit_file_by_replace`~~ Dropped because it showed degraded performance ([Agent] Improve edits by adding back append_file #2722 (comment)).
- Adds back `append_file` so that the agent can easily append content to the end of a file, due to the concern mentioned here.
- Adds '(this is the end of the file)\n' and '(this is the beginning of the file)\n' to tell the agent when it has scrolled to the bottom or top of the file.
- Fixes the extra newline produced by `insert_content_at_line`, introduced in [Agent] (Potentially) improve Editing using diff #2685.
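
To make the second and third items concrete, here is a minimal sketch of the two behaviors. It is not the code from `agentskills.py`: the function names `append_file_sketch` and `render_window`, and the `N|text` line format, are assumptions made for illustration. Only the two marker strings are taken from the description above.

```python
import os
from typing import List

# Marker strings quoted from the PR description.
BEGIN_MARKER = '(this is the beginning of the file)\n'
END_MARKER = '(this is the end of the file)\n'


def append_file_sketch(path: str, content: str) -> None:
    """Hypothetical helper: append content to the end of a file.

    Adds a separating newline when the existing file does not end with one,
    so appended content always starts on its own line.
    """
    needs_newline = False
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, 'rb') as f:
            f.seek(-1, os.SEEK_END)
            needs_newline = f.read(1) != b'\n'
    with open(path, 'a', encoding='utf-8') as f:
        if needs_newline:
            f.write('\n')
        f.write(content if content.endswith('\n') else content + '\n')


def render_window(lines: List[str], start: int, end: int) -> str:
    """Hypothetical helper: render 1-indexed lines start..end (inclusive).

    Emits the boundary markers when the window touches the top or bottom,
    so the reader of the output knows there is nothing further to scroll to.
    """
    out = ''
    if start <= 1:
        out += BEGIN_MARKER
    for i in range(max(start, 1), min(end, len(lines)) + 1):
        out += f'{i}|{lines[i - 1]}\n'  # the "N|text" format is an assumption
    if end >= len(lines):
        out += END_MARKER
    return out
```

With a three-line file, `render_window(lines, 1, 2)` would start with the beginning marker and omit the end marker, while `render_window(lines, 2, 3)` would do the opposite.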

TODO:

- This requires [Agent] (Potentially) improve Editing using diff #2685 to be merged first.
- Run SWE-Bench Lite to validate performance.

Other references


xingyaoww · 2024-07-07T18:40:12Z

@neubig Got the claude-3.5 results out. There is a slight degradation on claude-3.5 (69/300 = 23%) compared to 1.7 (73/300 = 24.3%). I suspect it might be caused by adding edit by line number alongside edit by replace.

Do you think we should remove edit by line and run the eval again, or just merge as is and try removing it in the next PR?

EDIT: I plan to try removing edit_file_by_line and see if it improves things - if so, we can just do it with this PR.


neubig · 2024-07-07T20:22:18Z

I agree that if accuracy has gone down we should probably re-try the eval.


xingyaoww · 2024-07-10T21:13:07Z

Improved results on this PR after merging #2846, which removes edit_file_by_line (keeping it does hurt!).

74 resolved out of 300-27=273 inference instances (~27% solve rate, or about 24.7% if we count all 300 instances).
Those 27 instances could not be run on claude-3.5-sonnet due to the following error:

litellm.BadRequestError: litellm.ContentPolicyViolationError: VertexAIException ContentPolicyViolationError - 
{"type":"error","error":{"type":"invalid_request_error","message":"Output blocked by content filtering policy"}}

Anthropic updated their article last week (https://support.anthropic.com/en/articles/9205721-why-am-i-receiving-an-output-blocked-by-content-filtering-policy-error), suggesting that '... they generally arise from Anthropic’s efforts to prevent Claude from being used to replicate or regurgitate pre-existing materials...'

Maybe Claude was trained on some repositories in the SWE-Bench test set and reproduced them exactly during inference, which Anthropic's latest update now blocks?

Nevertheless, 74 resolved versus 73 in the previous run is still a minor improvement, so I think this is worth merging.
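
For anyone checking the numbers, the solve rates quoted in this thread can be recomputed with a few lines. This is only a sketch of the arithmetic; the helper name `solve_rate` is made up for illustration, and the counts (73, 74, 300, 27) are the ones reported above.

```python
def solve_rate(resolved: int, total: int) -> float:
    """Return the solve rate as a percentage."""
    return 100.0 * resolved / total


# Previous run: 73 resolved out of all 300 instances.
previous_rate = solve_rate(73, 300)

# This PR: 74 resolved; 27 instances were blocked by the content filter.
attempted_rate = solve_rate(74, 300 - 27)  # only instances that actually ran
overall_rate = solve_rate(74, 300)         # blocked instances counted as unresolved

print(f'previous: {previous_rate:.1f}%')
print(f'this PR (273 attempted): {attempted_rate:.1f}%')
print(f'this PR (all 300): {overall_rate:.1f}%')
```

Counting the blocked instances as unresolved is the conservative comparison, since the previous run was scored against all 300.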


xingyaoww · 2024-07-10T23:36:06Z

Let me know if people have more concerns here (cc @tobitege). If not, I plan to merge this sometime tomorrow (in the next 12-18 hours) so we can go ahead and test #2489 with the updated "edit" function.

yufansong

LGTM

tobitege · 2024-07-11T03:04:28Z

opendevin/runtime/plugins/agent_skills/agentskills.py

@@ -230,17 +235,17 @@ def _cur_file_header(current_file, total_lines) -> str:

 @update_pwd_decorator
 def open_file(
-    path: str, line_number: int | None = 1, context_lines: int | None = 100
+    path: str, line_number: int | None = 1, context_lines: int | None = WINDOW


WINDOW should be 300?

Good catch! It is actually 100. I experimented earlier with 300 and it didn't go well, so I reverted back to 100 but forgot to revert the doc :(

There are 2 more locations of 300, I think
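
As a rough illustration of why a single `WINDOW` constant helps keep the code and the prompt text in agreement, here is a sketch of how a default window could be resolved per call. It is not the implementation in `agentskills.py`: the helper `visible_range` and its centering logic are assumptions. Only the name `WINDOW` and the value 100 come from this thread.

```python
from typing import Optional, Tuple

WINDOW = 100  # default number of lines shown at once, per the discussion above


def visible_range(
    line_number: int, total_lines: int, context_lines: Optional[int] = None
) -> Tuple[int, int]:
    """Hypothetical helper: 1-indexed inclusive range of lines to display.

    The window size is resolved per call and the module-level WINDOW is
    never mutated, so one call cannot change the default for later calls.
    """
    window = WINDOW if context_lines is None else context_lines
    half = window // 2
    start = max(1, line_number - half)
    end = min(total_lines, start + window - 1)
    # Near the end of the file, shift the window up to keep it full.
    start = max(1, end - window + 1)
    return start, end
```

Under these assumptions, `visible_range(500, 1000)` returns `(450, 549)`, and `visible_range(50, 1000, context_lines=10)` returns `(45, 54)` without affecting later calls that rely on the default.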

tobitege · 2024-07-11T03:12:19Z

> Let me know if people have more concerns here (cc @tobitege) -- if not, i plan to merge this sometime tomorrow (in the next 12-18 hours) so we can go ahead test #2489 with the updated "edit" function.

Of course, go ahead. 😃

xingyaoww added 15 commits June 28, 2024 08:29

add replace-based block edit & preliminary test case fix

d3a2d10

further fix the insert behavior

b03fd2c

make edit only work on first occurence

3700478

bump codeact version since we now use new edit agentskills

8138371

update prompt for new agentskills

45283e1

update integration tests

790b94f

make run_infer.sh executable

8e630c5

remove code block for edit_file

693a53f

update integration test for prompt changes

7fcba43

default to not use hint for eval

22de134

fix insert emptyfile bug

4ec06f0

throw value error when to_replace is empty

2c85f43

make _edit_or_insert_file return string so we can try to fix some linter errors (best attempt)

621a158

add todo

4cb8bdb

update integration test

20cdd32

xingyaoww mentioned this pull request Jul 1, 2024

[Agent] (Potentially) improve Editing using diff #2685

Merged

xingyaoww added 6 commits July 1, 2024 23:30

fix sandbox test for this PR

db14924

fix inserting with additional newline

00f57e9

rename to edit_file_by_replace

a2f0916

add back edit_file_by_line

cbf2298

update prompt for new editing tool

fb0b9de

fix integration tests

cf9eef9

xingyaoww force-pushed the xw/further-improve-edit branch from 25577be to cf9eef9 Compare July 1, 2024 22:30

xingyaoww added 7 commits July 2, 2024 06:31

bump codeact version since there are more changes

a251122

add back append file

0aa476c

fix current line for append

ab47585

fix append unit tests

eba37d5

change the location where we show edited line no to agent and fix tests

22274be

update integration tests

a45953d

fix global window size affect by open_file bug

f85c847

neubig assigned xingyaoww Jul 7, 2024

xingyaoww added 3 commits July 8, 2024 02:57

Merge commit 'adf1a0d55654d0065b615d0b4a6534cc5478b732' into xw/further-improve-edit

4c6fa51

fix integration tests

5247bfa

remove edit file by line

949b38d

xingyaoww mentioned this pull request Jul 7, 2024

[WIP, Agent] Extension editing experiments: removing edit_file_by_line #2846

Closed

xingyaoww and others added 6 commits July 8, 2024 04:41

fix integration tests

b4ac713

add instruction to avoid hanging

06fd2c5

Revert "add instruction to avoid hanging"

e690830

This reverts commit 06fd2c5.

handle content policy violation error

57823b3

Merge commit 'e6908307b436b07ede7ff3c093aa79db2ce6ee9c' into xw/further-improve-edit

5be8c05

Merge commit '57823b3846cc297933282b181d6eed2c7511bcdc' into xw/further-improve-edit

7c67798

xingyaoww mentioned this pull request Jul 10, 2024

[Arch] Implement EventStream Runtime Client with Jupyter Support using Agnostic Sandbox #2879

Merged

3 tasks

Merge commit '456690818c94a266935888f1e56e0afa2c4d5219' into xw/further-improve-edit

8df6970

xingyaoww force-pushed the xw/further-improve-edit branch from 04b4acf to 8df6970 Compare July 10, 2024 21:18

fix integration tests

b474bbd

yufansong approved these changes Jul 11, 2024


tobitege reviewed Jul 11, 2024


fix typo in prompt - the window is 100

f02c145

xingyaoww force-pushed the xw/further-improve-edit branch from f9c1785 to f02c145 Compare July 11, 2024 14:12

xingyaoww added 2 commits July 11, 2024 22:25

update all integration tests

5473bf8

Merge branch 'main' into xw/further-improve-edit

1204fe1

xingyaoww enabled auto-merge (squash) July 11, 2024 14:26

xingyaoww mentioned this pull request Jul 11, 2024

#2220, integrated aider style linting, currently passes related o… #2489

Merged

xingyaoww merged commit 1b54800 into All-Hands-AI:main Jul 11, 2024
2 checks passed

xingyaoww changed the title ~~[Agent] Improve edits by adding back edit_file_by_line~~ [Agent] Improve edits by adding back append_file Jul 13, 2024
