[CI][Python][C++] Support on Power Architecture #43817

Open
sandeepgupta12 opened this issue Aug 26, 2024 · 8 comments

Comments

@sandeepgupta12

Describe the enhancement requested

Description:
We need to extend support for apache/arrow to the POWER/PPC64LE architecture.

Background:
• We have forked the apache/arrow repository and have successfully generated and tested wheels for both C++ and Python using a self-hosted CI runner on an OSU PPC64LE machine.
• The forked repository includes the following changes:
1. Added a job for PPC64LE in .github/workflows/cpp.yaml and .github/workflows/python.yaml, along with corresponding updates to docker-compose.yaml.
2. Created new Dockerfiles for C++ and Python: ci/docker/ppc64le-cpp.dockerfile and ci/docker/ppc64le-python.dockerfile.
3. Added build and test scripts for Python: ci/scripts/ppc64le_python_build.sh and ci/scripts/ppc64le_python_test.sh.

• We would like to upstream these changes to enable CI for the ppc64le architecture using a GHA self-hosted runner.

Fork Information:
• Forked Repository: https://github.com/sandeepgupta12/arrow

Request:
• Support for PPC64LE: We are seeking support for the PPC64LE architecture for the apache/arrow project.
• Creation of OSU VM: To facilitate further testing and CI integration, we request the creation of an OSU VM configured for PPC64LE. The VM can be requested using the details below:
URL: https://osuosl.org/services/powerdev/request_hosting/
IBM Advocate: gerrit@us.ibm.com

Details:
The Open Source Lab (OSL) at Oregon State University (OSU), in partnership with IBM, provides access to IBM Power processor-based servers for developing and testing open source projects. The OSL offers the following clusters:
OpenStack (non-GPU) Cluster:
Architecture: Power little endian (LE) instances
Virtualization: Kernel-based virtual machine (KVM)
Access: Via Secure Shell (SSH) and/or through OpenStack's API and GUI interface
Capabilities: Ideal for functional development and continuous integration (CI) work. It supports a managed Jenkins service hosted on the cluster or as a node incorporated into an external CI/CD pipeline.

Additional Information:
• We are prepared to provide any further details or assistance needed to support the PPC64LE architecture.
Please let us know if there are any specific requirements or steps needed to move forward with this request.

Component(s)

C++, Python

@raulcd
Member

raulcd commented Aug 26, 2024

Thanks for reviving this. Since we moved away from Travis we stopped testing with little endian.
I remember @kiszk discussing the use of OSUOSL for s390x here: #35374 (comment)
I am concerned about the security implications of managing those boxes. Is this done by OSL? Are the VMs ephemeral or long-lived? Do we have to ask ASF infra (@assignUser?)

@kiszk
Member

kiszk commented Aug 26, 2024

Yes, I talked about OSL. However, I recently changed my mind and now prefer using a GHA self-hosted runner, after seeing this article:
https://community.ibm.com/community/user/powerdeveloper/blogs/gerrit-huizenga/2024/03/06/github-actions-runner-for-ibm-power-and-linuxone

@assignUser
Member

I agree with @raulcd: we cannot support any non-ephemeral VM runners for security reasons; they are much too big a risk in a public repo. This has been used to compromise major open-source repos before: https://www.legitsecurity.com/blog/github-pytorch-and-more-organizations-found-vulnerable-to-self-hosted-runner-attacks

I'd be happy to add Power runners if they are ephemeral (i.e. the VM gets destroyed after each job), which is what we currently have for ARM runners using k8s: https://github.com/voltrondata-labs/gha-controller-infra

@kiszk
Member

kiszk commented Aug 27, 2024

@raulcd @assignUser Thank you for sharing useful information.

As far as I know, this self-hosted runner framework for ppc64le and s390x uses ephemeral VMs.

@assignUser
Member

@kiszk No, I don't think it does. The "ephemeral" there refers to the image and how it needs to be built with the runner token to work; at least that's how I read it.

The line where it starts the runner doesn't have any mechanism to kill the container and start a new one for each new job, as would be required for ephemeral runners. That is what the controller is for: it starts a new container/runner for each job and removes the old one.

@anup-kodlekere

@assignUser Hi! If ephemerality is the concern, we can set the config parameters to launch ephemeral LXD containers; that wouldn't be an issue. You would still need to follow the instructions in https://github.com/anup-kodlekere/gaplib; the only thing that changes is how the containers are deployed and managed. However, we haven't tested this use case before and would need to run some tests to ensure functional correctness. A simple systemd service running a Python/bash script would act as the controller in this case, launching a clean LXD build environment (within the same VM host) for each new job.
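
For illustration only, the controller idea described above might look roughly like the sketch below. It is not taken from gaplib and has not been tested on a Power host. The function names, the container name prefix, the `ubuntu:22.04` image, and the job command are all hypothetical placeholders. It assumes the `lxc` CLI is available and relies on `lxc launch --ephemeral`, which asks LXD to delete the container once it is stopped.

```python
import subprocess
import uuid
from typing import Callable, Iterable, List, Sequence

# A "runner" is any callable that executes one command (injected for testing).
Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Default executor: run the command and raise if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def run_job_in_fresh_container(
    job_cmd: Sequence[str],
    image: str = "ubuntu:22.04",  # assumption: placeholder base image
    run: Runner = _run,
) -> str:
    """Launch a new ephemeral LXD container, run one job in it, then stop it."""
    name = f"gha-{uuid.uuid4().hex[:8]}"  # hypothetical naming scheme
    # Ephemeral containers are removed by LXD as soon as they are stopped.
    run(["lxc", "launch", image, name, "--ephemeral"])
    try:
        run(["lxc", "exec", name, "--", *job_cmd])
    finally:
        # Always stop the container, even if the job failed.
        run(["lxc", "stop", name, "--force"])
    return name


def controller(jobs: Iterable[Sequence[str]], run: Runner = _run) -> List[str]:
    """Run each job in its own container; no container is reused across jobs."""
    return [run_job_in_fresh_container(job, run=run) for job in jobs]
```

In a real setup the job command would presumably start the GitHub Actions runner inside the container, and the loop would be driven by incoming jobs rather than a fixed list; both are left out here.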

@kiszk
Member

kiszk commented Sep 9, 2024

@anup-kodlekere Thanks, great to hear that. Once the updated instructions are ready, I could try them with Arrow on s390x as a test.

@pitrou
Member

pitrou commented Sep 16, 2024

By the way, please add the Continuous Integration label to CI-related tasks, so that we can find them using a search :-)

10 participants
@raulcd @kiszk @pitrou @assignUser @anup-kodlekere @sandeepgupta12 and others