Upgrade openhands-aci to 0.1.7 #6123

Open
wants to merge 4 commits into
base: main

Conversation

ryanhoangt
Contributor

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

Give a summary of what the PR does, explaining any non-trivial design decisions

This PR is to:

  • Upgrade openhands-aci to 0.1.7.

CC: @xingyaoww


Link of any specific issues this addresses

@xingyaoww xingyaoww added the run-eval-m Runs evaluation with 30 instances label Jan 7, 2025
@xingyaoww
Collaborator

@mamoodi It seems the eval job failed again :( Any idea why?

@enyst
Collaborator

enyst commented Jan 7, 2025

Aha, I was just asking about eval https://github.com/All-Hands-AI/openhands-aci/pull/45/files#r1905796005

I'm curious if it shows anything.

@mamoodi
Collaborator

mamoodi commented Jan 7, 2025

Hey team. Let me get back to this. Unfortunately, you can't run the eval on a fork (no permission to access secrets).
I have a few things going on but can run one soon.

@mamoodi
Collaborator

mamoodi commented Jan 7, 2025

Triggered an eval on 30 instances.

@mamoodi
Collaborator

mamoodi commented Jan 7, 2025

evaluation-up-aci.zip

@ryanhoangt
Contributor Author

ryanhoangt commented Jan 8, 2025

When running via the UI, I see the hidden count is shown. But looking at the output of view in the evaluation output for instance astropy__astropy-14995, I'm not sure why it's not updated 🤔

openai__claude-3-5-sonnet-20241022-1736272164.5043015.json

Here's the repo at the base commit, which includes hidden dirs e.g. .github, .circleci, etc: https://github.com/swe-bench/astropy__astropy/tree/b16c7d12ccbc7b2d20364b89fb44285bcbfede54

@ryanhoangt
Contributor Author

ryanhoangt commented Jan 8, 2025

@mamoodi Did we run a poetry install in the evaluation job? When running the eval locally I see the output is updated, but it doesn't seem to be updated in the zip file from your run.
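
One quick way to check whether an environment actually picked up the upgraded dependency is to ask Python which version is installed. This is only a minimal sketch, assuming the package is installed under the distribution name openhands-aci; the helper name installed_version is made up for illustration and is not part of the eval job.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the distribution is named "openhands-aci"; this PR targets 0.1.7.
    found = installed_version("openhands-aci")
    print(f"openhands-aci installed: {found!r} (this PR expects '0.1.7')")
```

If this prints an older version (or None) inside the evaluation environment, the dependencies were probably not reinstalled after the bump.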

@enyst
Collaborator

enyst commented Jan 8, 2025

@ryanhoangt Just a quick thought: I see here that the hidden message is added as a second element in the list, not as a continuation of the first string. Is that necessary? Maybe it should be part of a single string. Otherwise it seems like we lose it somewhere along the way, where the code assumes there can be only one element (maybe in the agent?)
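
To make the guess above concrete, here is a small sketch of how a second list element could get lost. The names first_only and join_all and the sample strings are hypothetical, not the actual openhands-aci or agent code, and enyst later notes in this thread that this guess was not the real cause.

```python
# Hypothetical data shapes for illustration only.
listing = (
    "Here's the files and directories up to 2 levels deep in /repo, "
    "excluding hidden items:\n/repo\n/repo/setup.py"
)
hidden_note = "13 hidden files/directories in this directory are excluded."


def first_only(parts: list) -> str:
    """A consumer that assumes a single-element list silently drops the rest."""
    return parts[0]


def join_all(parts: list) -> str:
    """A consumer that keeps every element by joining them into one string."""
    return "\n\n".join(parts)


if __name__ == "__main__":
    parts = [listing, hidden_note]
    print(hidden_note in first_only(parts))  # the note is dropped
    print(hidden_note in join_all(parts))    # the note is kept
```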

@ryanhoangt
Contributor Author

@enyst Can you elaborate a bit, maybe with an example? I'm not sure I understand your concern 😅 Here's what the output looks like, which actually makes sense to me:

Here's the files and directories up to 2 levels deep in /workspace/astropy__astropy__5.2, excluding hidden items:
/workspace/astropy__astropy__5.2
/workspace/astropy__astropy__5.2/GOVERNANCE.md
/workspace/astropy__astropy__5.2/setup.py
/workspace/astropy__astropy__5.2/tox.ini
/workspace/astropy__astropy__5.2/CODE_OF_CONDUCT.md
/workspace/astropy__astropy__5.2/setup.cfg
/workspace/astropy__astropy__5.2/licenses
/workspace/astropy__astropy__5.2/licenses/PYTHON.rst
/workspace/astropy__astropy__5.2/licenses/PYFITS.rst
/workspace/astropy__astropy__5.2/licenses/NUMPY_LICENSE.rst
/workspace/astropy__astropy__5.2/licenses/CONFIGOBJ_LICENSE.rst
/workspace/astropy__astropy__5.2/licenses/JQUERY_LICENSE.rst
/workspace/astropy__astropy__5.2/licenses/PLY_LICENSE.rst
/workspace/astropy__astropy__5.2/licenses/EXPAT_LICENSE.rst
/workspace/astropy__astropy__5.2/licenses/README.rst
/workspace/astropy__astropy__5.2/licenses/AURA_LICENSE.rst
/workspace/astropy__astropy__5.2/licenses/ERFA.rst
/workspace/astropy__astropy__5.2/licenses/DATATABLES_LICENSE.rst
/workspace/astropy__astropy__5.2/licenses/GATSPY_LICENSE.rst
/workspace/astropy__astropy__5.2/licenses/WCSLIB_LICENSE.rst
/workspace/astropy__astropy__5.2/CHANGES.rst
/workspace/astropy__astropy__5.2/CITATION
/workspace/astropy__astropy__5.2/README.rst
/workspace/astropy__astropy__5.2/conftest.py
/workspace/astropy__astropy__5.2/cextern
/workspace/astropy__astropy__5.2/cextern/wcslib
/workspace/astropy__astropy__5.2/cextern/trim_expat.sh
/workspace/astropy__astropy__5.2/cextern/trim_cfitsio.sh
/workspace/astropy__astropy__5.2/cextern/cfitsio
/workspace/astropy__astropy__5.2/cextern/expat
/workspace/astropy__astropy__5.2/cextern/trim_wcslib.sh
/workspace/astropy__astropy__5.2/cextern/README.rst
/workspace/astropy__astropy__5.2/examples
/workspace/astropy__astropy__5.2/examples/README.rst
/workspace/astropy__astropy__5.2/examples/template
/workspace/astropy__astropy__5.2/examples/coordinates
/workspace/astropy__astropy__5.2/examples/io
/workspace/astropy__astropy__5.2/docs
/workspace/astropy__astropy__5.2/docs/lts_policy.rst
/workspace/astropy__astropy__5.2/docs/_static
/workspace/astropy__astropy__5.2/docs/cosmology
/workspace/astropy__astropy__5.2/docs/glossary.rst
/workspace/astropy__astropy__5.2/docs/convolution
/workspace/astropy__astropy__5.2/docs/importing_astropy.rst
/workspace/astropy__astropy__5.2/docs/_templates
/workspace/astropy__astropy__5.2/docs/units
/workspace/astropy__astropy__5.2/docs/nitpick-exceptions
/workspace/astropy__astropy__5.2/docs/uncertainty
/workspace/astropy__astropy__5.2/docs/robots.txt
/workspace/astropy__astropy__5.2/docs/credits.rst
/workspace/astropy__astropy__5.2/docs/wcs
/workspace/astropy__astropy__5.2/docs/logging.rst
/workspace/astropy__astropy__5.2/docs/time
/workspace/astropy__astropy__5.2/docs/install.rst
/workspace/astropy__astropy__5.2/docs/constants
/workspace/astropy__astropy__5.2/docs/whatsnew
/workspace/astropy__astropy__5.2/docs/rtd_environment.yaml
/workspace/astropy__astropy__5.2/docs/_pkgtemplate.rst
/workspace/astropy__astropy__5.2/docs/index.rst
/workspace/astropy__astropy__5.2/docs/modeling
/workspace/astropy__astropy__5.2/docs/stats
/workspace/astropy__astropy__5.2/docs/visualization
/workspace/astropy__astropy__5.2/docs/conftest.py
/workspace/astropy__astropy__5.2/docs/config
/workspace/astropy__astropy__5.2/docs/warnings.rst
/workspace/astropy__astropy__5.2/docs/table
/workspace/astropy__astropy__5.2/docs/known_issues.rst
/workspace/astropy__astropy__5.2/docs/changes
/workspace/astropy__astropy__5.2/docs/nddata
/workspace/astropy__astropy__5.2/docs/timeseries
/workspace/astropy__astropy__5.2/docs/development
/workspace/astropy__astropy__5.2/docs/samp
/workspace/astropy__astropy__5.2/docs/coordinates
/workspace/astropy__astropy__5.2/docs/changelog.rst
/workspace/astropy__astropy__5.2/docs/Makefile
/workspace/astropy__astropy__5.2/docs/make.bat
/workspace/astropy__astropy__5.2/docs/common_links.txt
/workspace/astropy__astropy__5.2/docs/license.rst
/workspace/astropy__astropy__5.2/docs/utils
/workspace/astropy__astropy__5.2/docs/conf.py
/workspace/astropy__astropy__5.2/docs/io
/workspace/astropy__astropy__5.2/CONTRIBUTING.md
/workspace/astropy__astropy__5.2/astropy
/workspace/astropy__astropy__5.2/astropy/cosmology
/workspace/astropy__astropy__5.2/astropy/__init__.py
/workspace/astropy__astropy__5.2/astropy/convolution
/workspace/astropy__astropy__5.2/astropy/compiler_version.cpython-39-x86_64-linux-gnu.so
/workspace/astropy__astropy__5.2/astropy/extern
/workspace/astropy__astropy__5.2/astropy/units
/workspace/astropy__astropy__5.2/astropy/uncertainty
/workspace/astropy__astropy__5.2/astropy/_version.py
/workspace/astropy__astropy__5.2/astropy/wcs
/workspace/astropy__astropy__5.2/astropy/time
/workspace/astropy__astropy__5.2/astropy/tests
/workspace/astropy__astropy__5.2/astropy/constants
/workspace/astropy__astropy__5.2/astropy/_compiler.c
/workspace/astropy__astropy__5.2/astropy/modeling
/workspace/astropy__astropy__5.2/astropy/stats
/workspace/astropy__astropy__5.2/astropy/version.py
/workspace/astropy__astropy__5.2/astropy/logger.py
/workspace/astropy__astropy__5.2/astropy/visualization
/workspace/astropy__astropy__5.2/astropy/CITATION
/workspace/astropy__astropy__5.2/astropy/conftest.py
/workspace/astropy__astropy__5.2/astropy/config
/workspace/astropy__astropy__5.2/astropy/table
/workspace/astropy__astropy__5.2/astropy/nddata
/workspace/astropy__astropy__5.2/astropy/timeseries
/workspace/astropy__astropy__5.2/astropy/samp
/workspace/astropy__astropy__5.2/astropy/coordinates
/workspace/astropy__astropy__5.2/astropy/_dev
/workspace/astropy__astropy__5.2/astropy/utils
/workspace/astropy__astropy__5.2/astropy/io
/workspace/astropy__astropy__5.2/codecov.yml
/workspace/astropy__astropy__5.2/astropy.egg-info
/workspace/astropy__astropy__5.2/astropy.egg-info/not-zip-safe
/workspace/astropy__astropy__5.2/astropy.egg-info/entry_points.txt
/workspace/astropy__astropy__5.2/astropy.egg-info/top_level.txt
/workspace/astropy__astropy__5.2/astropy.egg-info/requires.txt
/workspace/astropy__astropy__5.2/astropy.egg-info/PKG-INFO
/workspace/astropy__astropy__5.2/astropy.egg-info/SOURCES.txt
/workspace/astropy__astropy__5.2/astropy.egg-info/dependency_links.txt
/workspace/astropy__astropy__5.2/MANIFEST.in
/workspace/astropy__astropy__5.2/pyproject.toml
/workspace/astropy__astropy__5.2/LICENSE.rst


13 hidden files/directories in this directory are excluded. You can use 'ls -la /workspace/astropy__astropy__5.2' to see them.
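
For readers unfamiliar with this output, the following is a rough sketch of how a listing like the one above could be produced: walk at most two levels, skip hidden entries, and report how many hidden entries sit directly in the root. The function view_directory is a hypothetical stand-in written for this example and is not the openhands-aci implementation, which may work differently (for instance by shelling out).

```python
import os
import tempfile


def view_directory(root: str, max_depth: int = 2) -> str:
    """List non-hidden paths under `root` up to `max_depth` levels deep and
    report how many hidden entries sit directly in `root`."""
    lines = [root]
    hidden_count = sum(1 for name in os.listdir(root) if name.startswith("."))

    def walk(path: str, depth: int) -> None:
        if depth > max_depth:
            return
        for name in sorted(os.listdir(path)):
            if name.startswith("."):
                continue  # hidden items are excluded from the listing
            child = os.path.join(path, name)
            lines.append(child)
            if os.path.isdir(child):
                walk(child, depth + 1)

    walk(root, 1)
    out = (
        f"Here's the files and directories up to {max_depth} levels deep "
        f"in {root}, excluding hidden items:\n" + "\n".join(lines)
    )
    if hidden_count:
        out += (
            f"\n\n{hidden_count} hidden files/directories in this directory "
            f"are excluded. You can use 'ls -la {root}' to see them."
        )
    return out


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, ".github"))
        os.makedirs(os.path.join(tmp, "docs", "wcs", "deep"))
        open(os.path.join(tmp, "setup.py"), "w").close()
        print(view_directory(tmp))
```

In this sketch the hidden-item note is appended to the same string as the listing, so a consumer sees it regardless of how the result is passed along.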

@enyst
Collaborator

enyst commented Jan 8, 2025

No worries, it's not a concern, it was just a guess as to why it may "lose" the second string along the way. Looks like it was a bad guess. It seems you found the actual issue!

@ryanhoangt
Contributor Author

ryanhoangt commented Jan 9, 2025

I ran an eval locally on the 30 instances above, and the result looks reasonable (baseline got 13/30). CC @xingyaoww

Screenshot 2025-01-09 at 16 02 19

@xingyaoww
Collaborator

@ryanhoangt is this the result AFTER we fixed the ordering issue?

@ryanhoangt
Contributor Author

ryanhoangt commented Jan 9, 2025

No, the ordering fix isn't included in this release. It only contains your fix.

@xingyaoww
Collaborator

Can we bring in the ordering fix too? We can directly bump this to 0.1.8
