[CI][Python][C++] Support on Power Architecture #43817

sandeepgupta12 · 2024-08-26T11:52:17Z

Describe the enhancement requested

Description:
We need to extend support for apache/arrow to the POWER/PPC64LE architecture.

Background:
• We have forked the apache/arrow repository and have successfully generated and tested wheels for both C++ and Python using a self-hosted CI runner on an OSU PPC64LE machine.
• The forked repository includes the following changes:
1. Added a job for PPC64LE in .github/workflows/cpp.yaml and .github/workflows/python.yaml, along with corresponding updates to docker-compose.yaml.
2. Created new Dockerfiles for C++ and Python: ci/docker/ppc64le-cpp.dockerfile and ci/docker/ppc64le-python.dockerfile.
3. Added build and test scripts for Python: ci/scripts/ppc64le_python_build.sh and ci/scripts/ppc64le_python_test.sh.

• We would like to upstream these changes to enable CI for the ppc64le architecture using a GHA self-hosted runner.

Fork Information:
• Forked Repository: https://github.com/sandeepgupta12/arrow

Request:
• Support for PPC64LE: We are seeking support for the PPC64LE architecture for the apache/arrow project.
• Creation of OSU VM: To facilitate further testing and CI integration, we request the creation of an OSU VM configured for PPC64LE. The VM can be requested using the details below:
URL: https://osuosl.org/services/powerdev/request_hosting/
IBM Advocate: gerrit@us.ibm.com

Details:
The Open Source Lab (OSL) at Oregon State University (OSU), in partnership with IBM, provides access to IBM Power processor-based servers for developing and testing open source projects. The OSL offers the following clusters:
OpenStack (non-GPU) Cluster:
• Architecture: Power little endian (LE) instances
• Virtualization: Kernel-based virtual machine (KVM)
• Access: Via Secure Shell (SSH) and/or through OpenStack's API and GUI interface
• Capabilities: Ideal for functional development and continuous integration (CI) work. It supports a managed Jenkins service hosted on the cluster or as a node incorporated into an external CI/CD pipeline.

Additional Information:
• We are prepared to provide any further details or assistance needed to support the PPC64LE architecture.
Please let us know if there are any specific requirements or steps needed to move forward with this request.

Component(s)

C++, Python

raulcd · 2024-08-26T12:59:59Z

Thanks for reviving this. Since we moved away from Travis we stopped testing with little endian.
I remember @kiszk discussing using osuosl for s390x here: #35374 (comment)
I am concerned about the security implications of managing those boxes. Is this done by OSL? Are the VMs ephemeral or are they long-lived? Do we have to ask ASF infra (@assignUser)?

kiszk · 2024-08-26T13:06:13Z

Yes, I talked about OSL. But I recently changed my mind and now prefer using a GHA self-hosted runner, since I saw this article:
https://community.ibm.com/community/user/powerdeveloper/blogs/gerrit-huizenga/2024/03/06/github-actions-runner-for-ibm-power-and-linuxone

assignUser · 2024-08-27T01:08:46Z

I agree with @raulcd: we cannot support any non-ephemeral VM runners for security reasons; they are much too big a risk in a public repo. This has been used to compromise major open-source repos before: https://www.legitsecurity.com/blog/github-pytorch-and-more-organizations-found-vulnerable-to-self-hosted-runner-attacks

I'd be happy to add Power runners if they are ephemeral (i.e. the VM gets destroyed after each job), which we currently have for ARM runners using k8s: https://github.com/voltrondata-labs/gha-controller-infra

kiszk · 2024-08-27T05:45:12Z

@raulcd @assignUser Thank you for sharing useful information.

As far as I know, this self-hosted runner framework for ppc64le and s390x uses ephemeral VMs.

assignUser · 2024-08-29T00:48:38Z

@kiszk No, I don't think it is. The "ephemeral" there refers to the image and how it needs to be built with the runner token to work; at least that's how I read it.

The line where it starts the runner doesn't have any mechanism to kill the container and start a new one for a new job (as would be required for ephemeral runners). That is what the controller is for: it starts a new container/runner for each job and removes the old one.

anup-kodlekere · 2024-09-04T11:28:54Z

@assignUser Hi! If ephemerality is the concern, then we can set the config parameters to launch ephemeral LXD containers; that wouldn't be an issue. You would still need to follow the instructions in https://github.com/anup-kodlekere/gaplib; the only thing that changes is how the containers are deployed and managed. However, we haven't tested this use case before and would need to run some tests to ensure functional correctness. A simple systemd service running a Python/bash script will act as the controller in this case, launching a clean LXD build environment (within the same VM host) for each new job.
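
As a rough illustration of the controller idea described above, here is a minimal Python sketch. It is not taken from gaplib or from this thread: the function names, the `runner-` naming scheme, the image name, and the way pending jobs are supplied are all hypothetical. The only external behaviour it relies on is the LXD CLI's `lxc launch ... --ephemeral` option, which makes LXD delete the container once it stops.

```python
import subprocess
import uuid
from typing import Callable, Iterable, List


def build_launch_command(image: str, name: str) -> List[str]:
    """Build the LXD CLI call that starts a throwaway container.

    With --ephemeral, LXD removes the container when it is stopped,
    so no state survives from one job to the next.
    """
    return ["lxc", "launch", image, name, "--ephemeral"]


def default_run(command: List[str]) -> None:
    """Execute a command and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True)


def run_controller(
    pending_jobs: Iterable[str],
    image: str,
    run: Callable[[List[str]], None] = default_run,
) -> List[str]:
    """Launch one fresh container per pending job.

    `pending_jobs` is a hypothetical stream of job identifiers; how the
    controller learns about queued jobs is left open here.
    Returns the names of the containers that were launched.
    """
    launched = []
    for job_id in pending_jobs:
        # A random suffix keeps container names unique across re-runs.
        name = f"runner-{job_id}-{uuid.uuid4().hex[:8]}"
        run(build_launch_command(image, name))
        launched.append(name)
    return launched
```

In a real deployment the runner inside each container would presumably also be registered so that it accepts a single job and then exits, at which point the container stops and LXD discards it. Passing `run` as a parameter lets the launch logic be exercised without LXD installed.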

kiszk · 2024-09-09T15:35:39Z

@anup-kodlekere Thanks, great to hear that. Once the updated instructions are ready, I could try them with Arrow on s390x as a test.

pitrou · 2024-09-16T09:30:51Z

By the way, please add the Continuous Integration label to CI-related tasks, so that we can find them using a search :-)

sandeepgupta12 added the Type: enhancement label Aug 26, 2024

github-actions bot added Component: C++ Component: Python labels Aug 26, 2024

raulcd mentioned this issue Sep 16, 2024

[CI] Migrate jobs on Travis CI to dev/tasks/ #20496

Closed

pitrou added the Component: Continuous Integration label Sep 16, 2024
