Use LLM to analyze ML-Bench failure cases #2399

Merged: 96 commits, Jun 13, 2024
a0bdeae
add ml-bench w/o exec env
May 23, 2024
efd0bc5
fix
May 23, 2024
5729199
fix
super-dainiu May 23, 2024
55c60d2
Merge branch 'main' of https://github.com/super-dainiu/OpenDevin
super-dainiu May 23, 2024
01f81e1
fix typos (#1956)
RainRat May 21, 2024
a42dcdb
Refactored Logs (#1939)
SmartManoj May 21, 2024
01eb1b8
[Feat] A competitive Web Browsing agent (#1856)
frankxu2004 May 21, 2024
64c500f
Update README.md SWE-bench score (#1959)
neubig May 22, 2024
7f3b4b7
fix: llm is_local function logic error (#1961)
Shimada666 May 22, 2024
638da19
doc: update documentation about poetry update (#1962)
yufansong May 22, 2024
79e51f3
feat: add metrics related to cost for better observability (#1944)
yufansong May 22, 2024
4bcacc3
doc: add more cmd in unit test documentation (#1963)
yufansong May 22, 2024
e8fb4dd
--- (#1975)
dependabot[bot] May 22, 2024
58ff37e
--- (#1976)
dependabot[bot] May 22, 2024
dbf84e8
Logging security (#1943)
enyst May 22, 2024
aa0157e
--- (#1967)
dependabot[bot] May 22, 2024
621babd
--- (#1968)
dependabot[bot] May 22, 2024
181f52f
--- (#1969)
dependabot[bot] May 22, 2024
6843912
--- (#1970)
dependabot[bot] May 22, 2024
d64580d
--- (#1971)
dependabot[bot] May 22, 2024
7f634bc
Refactor session management (#1810)
rbren May 22, 2024
fa80ebe
fix #1960 (#1964)
zeul22 May 23, 2024
a51a0fc
Add ruff for shared mutable defaults (B) (#1938)
enyst May 23, 2024
dba1e53
Refactor integration testing CI, add optional Mac tests, and mark a f…
li-boxuan May 23, 2024
9f6eed4
Fix Repeated Responses in Chat by Adding IPythonRunCellObservation (#…
jiangleo May 23, 2024
a08f8c0
Save CI cycles for backend tests (#1985)
li-boxuan May 23, 2024
56eb1a7
Fix typo in prompt (#1992)
jeremi May 23, 2024
128232e
Refactor monologue and SWE agent to use the messages in state history…
enyst May 23, 2024
5b0af10
fix: catch session file not existed exception when init EventStream(m…
assertion May 23, 2024
9f89abc
add ml-bench in readme
super-dainiu May 23, 2024
3c29635
Bump boto3 from 1.34.110 to 1.34.111 (#2001)
dependabot[bot] May 23, 2024
420eeb8
Bump docker from 7.0.0 to 7.1.0 (#2002)
dependabot[bot] May 23, 2024
9d8d050
Bump litellm from 1.37.20 to 1.38.0 (#2005)
dependabot[bot] May 23, 2024
862f96e
Fix SWE-Bench evaluation due to setuptools version (#1995)
xingyaoww May 23, 2024
61d7e9a
fix session state after resuming (#1999)
rbren May 23, 2024
31f2c4a
Implement `agentskills` for OpenDevin to helpfully improve edit AND i…
xingyaoww May 23, 2024
8ca0938
build: Add poetry command to use Python 3.11 for environment setup (#…
DaxServer May 23, 2024
51f958d
Bump @react-types/shared from 3.23.0 to 3.23.1 in /frontend (#2006)
dependabot[bot] May 23, 2024
11d7d53
Bump @types/react-syntax-highlighter in /frontend (#2007)
dependabot[bot] May 23, 2024
e769cd3
Bump @typescript-eslint/parser from 7.9.0 to 7.10.0 in /frontend (#2008)
dependabot[bot] May 23, 2024
b6028d3
Bump lint-staged from 15.2.2 to 15.2.4 in /frontend (#2009)
dependabot[bot] May 23, 2024
fc73bf8
Merge branch 'main' of https://github.com/super-dainiu/OpenDevin
super-dainiu May 23, 2024
9ccb4c9
Merge branch 'main' into main
super-dainiu May 24, 2024
27ff0ac
Update README.md
super-dainiu May 24, 2024
be9a768
Update README.md
super-dainiu May 24, 2024
e182edc
add run_infer.sh
super-dainiu May 24, 2024
5485854
fix input output
super-dainiu May 24, 2024
0fb6e47
Merge branch 'main' into main
super-dainiu May 24, 2024
2bbeafe
fix docker sandbox
super-dainiu May 24, 2024
de9da69
fix run
super-dainiu May 24, 2024
982ddd1
update and clean run_infer.py
super-dainiu May 26, 2024
3dff9e0
add script to clean up dockers
super-dainiu May 26, 2024
c5a153c
update repo uid
super-dainiu May 26, 2024
b90d999
add description
super-dainiu May 26, 2024
7aa4b05
new
super-dainiu May 26, 2024
dd8ee90
Merge branch 'main' into main
super-dainiu May 26, 2024
b5be5bd
Update README.md
tangxiangru May 26, 2024
88bd7a4
use root for sandbox
super-dainiu May 26, 2024
00beb8f
Merge branch 'main' of https://github.com/super-dainiu/OpenDevin into…
super-dainiu May 26, 2024
73fb6f7
Merge branch 'main' into main
super-dainiu May 26, 2024
2b040c9
update readme
super-dainiu May 26, 2024
27bebde
Merge branch 'main' of https://github.com/super-dainiu/OpenDevin into…
super-dainiu May 26, 2024
bb1732b
Merge branch 'main' into main
super-dainiu May 26, 2024
a9fe980
Merge branch 'OpenDevin:main' into main
super-dainiu May 28, 2024
4966d6b
Merge branch 'main' into main
super-dainiu May 29, 2024
a630ea7
Merge branch 'main' into main
super-dainiu Jun 2, 2024
acd2eb1
update ml-bench conda env
super-dainiu Jun 2, 2024
3cdcb09
update readme
super-dainiu Jun 2, 2024
cc9237c
update readme
super-dainiu Jun 2, 2024
75cb16f
Merge branch 'OpenDevin:main' into main
super-dainiu Jun 3, 2024
785c0cc
use try except
super-dainiu Jun 3, 2024
3251f34
modify raise exception
super-dainiu Jun 3, 2024
c0fe06c
add int
super-dainiu Jun 3, 2024
fdc8a7a
update README
super-dainiu Jun 3, 2024
e278e8c
longer time
super-dainiu Jun 3, 2024
6a47e00
fix existing issues
super-dainiu Jun 4, 2024
1e63622
Merge branch 'main' into main
super-dainiu Jun 4, 2024
03be34a
fix existing issue
super-dainiu Jun 4, 2024
4f2c91f
Merge branch 'main' of https://github.com/super-dainiu/OpenDevin into…
super-dainiu Jun 4, 2024
f89b293
new docker image
super-dainiu Jun 4, 2024
b13bdfd
add metrics of cost
super-dainiu Jun 4, 2024
9fa1c5b
add result parsing cost
super-dainiu Jun 4, 2024
e5d879f
fix
super-dainiu Jun 4, 2024
f287a87
Merge branch 'main' into main
super-dainiu Jun 4, 2024
f2b9d7c
fix
super-dainiu Jun 5, 2024
2049365
update summarize
super-dainiu Jun 5, 2024
4137a73
fix
super-dainiu Jun 5, 2024
b9c89f8
fix continued inference
super-dainiu Jun 5, 2024
9057460
add analyze
super-dainiu Jun 11, 2024
54b4ec8
Merge branch 'OpenDevin:main' into main
super-dainiu Jun 11, 2024
eff2244
Merge branch 'main' of https://github.com/super-dainiu/OpenDevin into…
super-dainiu Jun 11, 2024
624737b
update readme
super-dainiu Jun 11, 2024
561f31f
use 4o
super-dainiu Jun 11, 2024
1cf2e39
Merge branch 'main' into main
super-dainiu Jun 12, 2024
805e856
add eval output
super-dainiu Jun 12, 2024
8289014
Merge branch 'main' of https://github.com/super-dainiu/OpenDevin into…
super-dainiu Jun 12, 2024
build: Add poetry command to use Python 3.11 for environment setup (#…
DaxServer authored and super-dainiu committed May 23, 2024
commit 8ca09388dc65d04a6c4f0eac356c55dfd0102622
3 changes: 2 additions & 1 deletion Makefile
@@ -130,8 +130,9 @@ pull-docker-image:

 install-python-dependencies:
 	@echo "$(GREEN)Installing Python dependencies...$(RESET)"
+	poetry env use python3.11
 	@if [ "$(shell uname)" = "Darwin" ]; then \
-		echo "$(BLUE)Installing `chroma-hnswlib`...$(RESET)"; \
+		echo "$(BLUE)Installing chroma-hnswlib...$(RESET)"; \
 		export HNSWLIB_NO_NATIVE=1; \
 		poetry run pip install chroma-hnswlib; \
 	fi