Thursday, September 24, 2015

Using a system installed conda to manage python environments

Motivation

I want to provide a common scipy stack across platforms, and possibly other python environments as well. Anaconda provides binary packages that can be installed into separate environments. However, it is normally geared toward being installed and managed by an individual user, while I want to manage the configuration centrally. With conda 4.1.6, I can use the upstream code essentially as is, with a couple of minor modifications. I have a PR filed here with my changes.

Installation

At the root is the conda package manager. I want to be able to install it from RPMs, so I created a conda COPR. This provides common conda and conda-env RPMs for Fedora and EPEL7. I've made conda-activate optional since it installs /usr/bin/{activate,deactivate}, which are very generic names, although having it makes loading the environments much simpler.
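For example, on an EPEL7 machine with the COPR repo enabled, installing and trying it out looks something like this (a sketch; the "scipy" environment is one of the environments created further down):

yum -y install conda conda-env conda-activate
# conda-activate provides the generic wrappers, so:
source activate scipy    # enter the environment
source deactivate        # and leave it again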

Configuration

The system conda reads /usr/.condarc as its config file.  This is an unfortunate location, but it's what the current code looks for; I'd like to change it to /etc/condarc in the future.  The COPR conda package ships this default:
envs_dirs:
 - /opt/anaconda/envs
 - ~/conda/envs
pkgs_dirs:
 - /var/cache/conda/pkgs

So we:
  • Point to our local environments, installed under /opt/anaconda/envs.
  • Cache downloaded packages in /var/cache.  This requires a patch to conda: https://github.com/conda/conda/pull/1637

Locally, I also point conda at our InstantMirror cache by setting "channels" and "channel_alias".
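Something like the following, appended to /usr/.condarc, does it (a sketch; the mirror hostname and channel name here are placeholders):

cat >> /usr/.condarc <<EOF
# conda prepends channel_alias to any channel given by bare name
channel_alias: http://instantmirror.example.com/
channels:
  - main
EOF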

Ansible

Configure everything and install the basic scipy env in ansible:

- name: Configure conda repo
  template: src=orion-conda.repo.j2 dest=/etc/yum.repos.d/orion-conda.repo

- name: Install conda
  package: name={{ item }} state=present
  with_items:
   - conda
   - conda-activate 
   - conda-env

- name: Configure conda
  copy: src=condarc dest=/usr/.condarc 
 
Then I have a conda_env.yml task file to install and manage the environments:
 
- stat: path=/opt/anaconda/envs/{{ env_name }}
  register: conda_env_dir

- name: Create conda {{ env_name }} env
  command: /usr/bin/conda create -y -n {{ env_name }} {{ env_req }} {{ env_pkgs | join(" ") }}
  when: not (conda_env_dir.stat.isdir is defined and conda_env_dir.stat.isdir)

- name: Update conda {{ env_name }} env
  conda: name={{ item }} extra_args="-n {{ env_name }}" state=latest
  with_items: "{{ env_pkgs }}"
  when: conda_env_dir.stat.isdir is defined and conda_env_dir.stat.isdir

- name: Install conda/{{ env_name }} module
  copy: src=modulefile dest=/etc/modulefiles/conda/{{ env_name }}

It is called for each environment like this:
 
- include: conda_env.yml env_name={{ conda_env }} env_req={{ conda_envs[conda_env] }} env_pkgs={{ conda_pkgs }}
  with_items: "{{ conda_envs }}"
  loop_control:
    loop_var: conda_env
  tags:
  - conda

With defaults/main.yml defining the environments:

conda_envs:
  scipy: python=2
  scipy3: python=3
  scipy34: python=3.4

conda_pkgs:
- astropy
- basemap
- ipython-notebook
- jupyter
- matplotlib
- netcdf4
- pandas
- scikit-learn
- scipy
- seaborn

This uses the ansible conda module to manage the created conda environments.
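With the modulefile installed, picking up an environment looks something like this (assuming the modulefile prepends the environment's bin directory to PATH):

module load conda/scipy3
python -c 'import scipy; print(scipy.__version__)'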

Friday, September 4, 2015

Automatically (and hopefully securely) configure ansible-pull with a secure ssh git repository

    We are just starting to play around with using ansible to configure our systems.  Since we have a lot of laptops and other machines that shut themselves down when idle, we need to use ansible-pull to configure them.  I also use cobbler to provision our systems and wanted to configure ansible-pull automatically as part of the install process.  The complicating factor is that, since we do not want our playbooks to be public, we use an ssh deployment key to access the git repository that ansible-pull pulls from.  So we needed a way to distribute the ansible private ssh key to the new systems.  Here is what I came up with:

* Create an ssh key pair for cobbler to use:

 
ssh-keygen -N '' -f ~/.ssh/id_rsa_cobbler


* Create a cobbler trigger to copy the ansible deployment key over to the newly installed system, in /var/lib/cobbler/triggers/install/post/ansible_key:

#!/bin/bash
[ "$1" = system ] &&
  /usr/bin/scp -i /root/.ssh/id_rsa_cobbler -o "StrictHostKeyChecking no" -p /root/.ssh/id_rsa_ansible ${2}:/root/.ssh/id_rsa_ansible
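
Cobbler only runs triggers that are executable, so remember to:

chmod +x /var/lib/cobbler/triggers/install/post/ansible_key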

* In %post, add the cobbler public key (id_rsa_cobbler.pub) to /root/.ssh/authorized_keys and only give it permission to scp to /root/.ssh/id_rsa_ansible (the forced command pins the exact server-side scp invocation, so the key cannot be used for anything else):

cat >> /root/.ssh/authorized_keys <<EOF
command="scp -p -t /root/.ssh/id_rsa_ansible",no-pty,no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa AAAAB...==
EOF

* In %post, start the sshd server so that cobbler can copy over the ssh key during the post-install trigger:

/usr/sbin/sshd-keygen
/usr/sbin/sshd
 
* In %post, configure ansible-pull to run at each boot:

cat > /etc/systemd/system/ansible-pull.service <<EOF
[Unit]
Description=Run ansible-pull on boot
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/bin/ansible-pull --url ssh://git@git.server.com/ansible-pull.git --key-file /root/.ssh/id_rsa_ansible

[Install]
WantedBy=multi-user.target
EOF
systemctl enable ansible-pull.service
echo localhost ansible_connection=local > /etc/ansible/inventory
 
* In %post, teach the machine about our git host:

echo [git.server.com]:51424,[10.10.10.10]:51424 ssh-rsa AAAA...== >> /root/.ssh/known_hosts
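
One way to generate that entry is to scan the git server directly (using the non-standard port from above):

ssh-keyscan -p 51424 git.server.com 10.10.10.10 >> /root/.ssh/known_hosts

This emits one [host]:port line per name, which is equivalent to the combined entry above.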

    This assumes we're using a local.yml playbook that has:

- hosts: localhost
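
Once everything is in place, the first pull can be tested by hand instead of waiting for a reboot:

systemctl start ansible-pull.service
journalctl -u ansible-pull.service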