This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
linux:virtualization:lxc [2016/10/02 00:37] tkilla [Unprivileged containers] |
linux:virtualization:lxc [2022/01/13 23:08] (current) tkilla [Unprivileged containers] |
||
---|---|---|---|
Line 86: | Line 86: | ||
**reload:** | **reload:** | ||
sysctl -p / | sysctl -p / | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | \\ | ||
+ | ==== Simple Nat Bridge ==== | ||
+ | |||
+ | Easy version without libvirt; works well at OVH and Hetzner. | ||
+ | |||
+ | **Add an additional bridge (keep eth0 as is) in / | ||
+ | |||
+ | auto lxc-bridge | ||
+ | iface lxc-bridge inet static | ||
+ | bridge_ports none | ||
+ | bridge_fd 0 | ||
+ | bridge_maxwait 0 | ||
+ | address 192.168.10.1 | ||
+ | netmask 255.255.255.0 | ||
+ | up iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE | ||
+ | |||
+ | |||
+ | **activate forwarding temporarily: | ||
+ | echo 1 > / | ||
+ | |||
+ | **activate forwarding permanently: | ||
+ | |||
+ | Uncomment in / | ||
+ | net.ipv4.ip_forward=1 | ||
+ | |||
+ | Activate new settings: | ||
+ | sysctl -p | ||
+ | |||
+ | |||
+ | **firewall rules** | ||
+ | |||
+ | # intern -> extern | ||
+ | iptables -t nat -A POSTROUTING -s 192.168.10.10/ | ||
+ | |||
+ | # ports extern -> intern - 1 rule for each $PORT | ||
+ | iptables -t nat -A PREROUTING | ||
+ | |||
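The per-port rule above is cut off in this view, so the following is only a sketch of the "1 rule for each $PORT" idea, not the original rule. It assumes TCP, eth0 as the public interface, and the container address 192.168.10.10 from the container config; the function name `print_dnat_rules` is hypothetical. It only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print one DNAT rule per port (nothing is executed).
# Assumptions: TCP only, public interface eth0, same port inside and outside.
print_dnat_rules() {
    container_ip="$1"
    shift
    for port in "$@"; do
        echo "iptables -t nat -A PREROUTING -i eth0 -p tcp --dport $port -j DNAT --to-destination $container_ip:$port"
    done
}

# Example: forward HTTP and HTTPS to the container
print_dnat_rules 192.168.10.10 80 443
```

Printing instead of executing keeps the sketch safe to try on a production host; pipe the output to `sh` only after checking it.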
+ | |||
+ | **Container config:** | ||
+ | lxc.network.type = veth | ||
+ | lxc.network.flags = up | ||
+ | lxc.network.link = lxc-bridge | ||
+ | lxc.network.ipv4.gateway = 192.168.10.1 | ||
+ | lxc.network.ipv4 = 192.168.10.10/ | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | * https:// | ||
+ | |||
\\ | \\ | ||
==== Nat Bridge via libvirt and IPV6 ==== | ==== Nat Bridge via libvirt and IPV6 ==== | ||
+ | |||
+ | **Deprecated! - if you do not need DHCP, use the simple bridge method without libvirt/ | ||
apt-get install libvirt-bin | apt-get install libvirt-bin | ||
Line 119: | Line 176: | ||
</ip> | </ip> | ||
<!-- IPV6 :: --> | <!-- IPV6 :: --> | ||
- | < | + | < |
</ | </ | ||
</ | </ | ||
Line 128: | Line 185: | ||
lxc.network.flags = up | lxc.network.flags = up | ||
lxc.network.link = virbr0 | lxc.network.link = virbr0 | ||
- | lxc.network.hwaddr = CC: | + | |
lxc.network.ipv4 = 192.168.122.100/ | lxc.network.ipv4 = 192.168.122.100/ | ||
lxc.network.ipv4.gateway = auto # auto usually works, otherwise set main IP gateway (.254 at OVH) | lxc.network.ipv4.gateway = auto # auto usually works, otherwise set main IP gateway (.254 at OVH) | ||
# ipv6: | # ipv6: | ||
- | lxc.network.ipv6 = 2001:41d0:0002:bb10: | + | lxc.network.ipv6 = 1234:1234:1234:1234:0100/64 |
lxc.network.ipv6.gateway = auto | lxc.network.ipv6.gateway = auto | ||
Line 151: | Line 208: | ||
## IPv6: | ## IPv6: | ||
iface eth0 inet6 static | iface eth0 inet6 static | ||
- | address | + | address |
netmask 64 | netmask 64 | ||
Line 167: | Line 224: | ||
virsh net-destroy default | virsh net-destroy default | ||
virsh net-start default | virsh net-start default | ||
+ | |||
+ | **if you remove this net, disable autostart** | ||
+ | virsh net-autostart default --disable | ||
+ | |||
**activate forwarding temporarily: | ||
Line 175: | Line 236: | ||
Uncomment in / | Uncomment in / | ||
net.ipv4.ip_forward=1 | net.ipv4.ip_forward=1 | ||
+ | |||
+ | Activate new settings: | ||
+ | sysctl -p | ||
**iptables config (1.2.3.4 is the public IP of the root system):** | **iptables config (1.2.3.4 is the public IP of the root system):** | ||
Line 200: | Line 264: | ||
echo "IPv6 setup FOREACH vserver.." | echo "IPv6 setup FOREACH vserver.." | ||
- | ip -6 route add 2001:41d0:2:bb10::100 dev virbr0 | + | ip -6 route add 1234:1234:1234:1234::100 dev virbr0 |
- | ip -6 neigh add proxy 2001:41d0:2:bb10::100 dev eth0 | + | ip -6 neigh add proxy 1234:1234:1234:1234::100 dev eth0 |
- | ping6 -I virbr0 -c 5 2001:41d0:2:bb10::100 | + | ping6 -I virbr0 -c 5 1234:1234:1234:1234::100 |
\\ | \\ | ||
Line 280: | Line 344: | ||
http:// | http:// | ||
- | | + | FIX: original keyserver is broken! add: --keyserver hkp:// |
+ | |||
+ | |||
+ | | ||
or | or | ||
- | lxc-create -n websrv -t debian-wheezy | + | lxc-create -n websrv -t debian-wheezy |
Start / Stop VS: | Start / Stop VS: | ||
Line 292: | Line 359: | ||
Enter VS: | Enter VS: | ||
lxc-console -n websrv | lxc-console -n websrv | ||
+ | |||
+ | |||
+ | In Buster, use the lxc-download script: | ||
+ | |||
+ | / | ||
+ | lxc-create -t / | ||
Line 302: | Line 375: | ||
lxc-clone --backingstore btrfs --orig vs1 --new vs2 --snapshot | lxc-clone --backingstore btrfs --orig vs1 --new vs2 --snapshot | ||
+ | \\ | ||
+ | ===== Mount external Dirs in Container ===== | ||
+ | |||
+ | The recommended way is to add the mountpoint with a relative path in the VS config: | ||
+ | |||
+ | lxc.mount.entry=/ | ||
+ | |||
+ | |||
+ | Under some circumstances it does not work (in unprivileged containers), | ||
+ | |||
+ | lxc.mount.entry = /home/test / | ||
+ | |||
+ | Also check permissions and ownership: chown the directory to the host-side ID that maps to root inside the container (100000 in the examples on this page). | ||
\\ | \\ | ||
===== btrfs snapshots ===== | ===== btrfs snapshots ===== | ||
+ | |||
+ | The container must be stopped for lxc-snapshot. Use a btrfs snapshot to back up running containers (MySQL data may end up inconsistent). | ||
you need to create container with option | you need to create container with option | ||
lxc-create -B btrfs -n mycontainer -t ubuntu | lxc-create -B btrfs -n mycontainer -t ubuntu | ||
+ | |||
+ | |||
Line 314: | Line 404: | ||
mv / | mv / | ||
- | btrfs subvolume create / | + | btrfs subvolume create / |
btrfs subvolume list / | btrfs subvolume list / | ||
- | lxc-snapshot -n webdev | ||
| | ||
# for unprivileged root container, check UID and GID of rootfs dir (here it is 100000): | # for unprivileged root container, check UID and GID of rootfs dir (here it is 100000): | ||
chown 100000: | chown 100000: | ||
+ | mv / | ||
+ | lxc-snapshot -n webdev | ||
+ | |||
+ | snapshot with comment | ||
+ | |||
+ | echo " | ||
+ | lxc-stop -n my-lxc-container | ||
+ | lxc-snapshot -n my-lxc-container -c snap-comment | ||
+ | rm snap-comment | ||
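The same steps could be wrapped in a small function so that the comment file is always cleaned up. This is a sketch, not a tested backup script: `snapshot_with_comment`, `run` and the `DRY_RUN` switch are hypothetical names, and the container is left stopped afterwards, as in the steps above.

```shell
#!/bin/sh
# Hypothetical wrapper around the stop / snapshot / cleanup steps.
# With DRY_RUN=1 the lxc commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

snapshot_with_comment() {
    name="$1"
    comment="$2"
    comment_file="$(mktemp)" || return 1
    printf '%s\n' "$comment" > "$comment_file"
    # Snapshot only if the container could be stopped
    run lxc-stop -n "$name" &&
        run lxc-snapshot -n "$name" -c "$comment_file"
    status=$?
    rm -f "$comment_file"
    return "$status"
}

# Example: show what would be executed
DRY_RUN=1
snapshot_with_comment my-lxc-container "before upgrade"
```

Using a temporary file from `mktemp` avoids leaving a stray `snap-comment` file in the working directory if one of the commands fails.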
* https:// | * https:// | ||
Line 342: | Line 440: | ||
==== Unprivileged containers ==== | ==== Unprivileged containers ==== | ||
+ | |||
+ | UIDs and GIDs are shifted into another range, so root (UID 0) becomes 100000, for example. Inside the container this is not visible, but from outside you see UIDs from 100000 upwards. You can still run these containers as root: you only have to add root to /etc/subuid and /etc/subgid, then it is the same as running the containers as a regular user. | ||
+ | |||
+ | For best security, each container should have its own UID/GID range, although breaking out of one container and into another is unlikely. | ||
+ | |||
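The shift described above is plain arithmetic: with a map such as `u 0 100000 65536`, a container ID maps to host ID 100000 + ID as long as the ID is below 65536. A small sketch, using a hypothetical `host_id` helper that handles a single map entry:

```shell
#!/bin/sh
# Hypothetical helper: translate a container uid/gid to the host-side id
# for one map entry "<container_start> <host_start> <range>".
host_id() {
    container_start="$1"
    host_start="$2"
    range="$3"
    id="$4"
    offset=$((id - container_start))
    if [ "$offset" -lt 0 ] || [ "$offset" -ge "$range" ]; then
        echo "id $id is outside the mapped range" >&2
        return 1
    fi
    echo $((host_start + offset))
}

host_id 0 100000 65536 0     # container root, prints 100000
host_id 0 100000 65536 33    # uid 33 inside the container, prints 100033
```

The result is the number to pass to chown on the root system when a file should belong to that user inside the container.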
only available in > v1.0, not in debian squeeze :( | only available in > v1.0, not in debian squeeze :( | ||
- | run unprivileged container as root: | + | **run unprivileged container as root:** |
add root to /etc/subuid and / | add root to /etc/subuid and / | ||
root: | root: | ||
- | vs config - map user ids: | + | |
+ | **vs config - map user ids:** | ||
put this in / | put this in / | ||
Line 357: | Line 461: | ||
lxc.id_map = g 0 100000 65536 | lxc.id_map = g 0 100000 65536 | ||
- | script to shift uuids to another span: | + | in buster it's called idmap: |
+ | lxc.idmap = u 0 100000 65536 | ||
+ | lxc.idmap = g 0 100000 65536 | ||
+ | |||
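When an older container config is moved to buster, the key rename can be applied mechanically. A minimal sketch that only rewrites the key name and leaves the mapping values alone; `rename_idmap_keys` is a hypothetical name, and it works as a stdin-to-stdout filter so the original config file is not modified:

```shell
#!/bin/sh
# Hypothetical filter: rename the legacy key lxc.id_map to lxc.idmap.
# Reads a container config on stdin and writes the result to stdout.
rename_idmap_keys() {
    sed 's/^\([[:space:]]*\)lxc\.id_map\([[:space:]]*=\)/\1lxc.idmap\2/'
}

# Example with the two map lines used on this page
printf '%s\n' 'lxc.id_map = u 0 100000 65536' 'lxc.id_map = g 0 100000 65536' | rename_idmap_keys
```

Redirect the output to a new file and compare it with the old config before replacing anything.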
+ | **shift uids to another range:** | ||
+ | |||
+ | Use this script: https:// | ||
+ | |||
+ | / | ||
- | for i in `seq 0 65535`; do | ||
- | find ${LXC_BASEDIR}${1}/ | ||
- | find ${LXC_BASEDIR}${1}/ | ||
- | done | ||
- | {{: | ||
create container - use download method for unprivileged. jessie is not available, so you can upgrade wheezy and fix systemd error :( | create container - use download method for unprivileged. jessie is not available, so you can upgrade wheezy and fix systemd error :( | ||
- | | + | FIX for download: Original keyserver is broken, add --keyserver hkp:// |
+ | |||
+ | | ||
# error no jessie: | # error no jessie: | ||
- | lxc-create -B btrfs -n websrv -t download -- -d debian -r jessie -a amd64 | + | lxc-create -B btrfs -n websrv -t download -- -d debian -r jessie -a amd64 --keyserver hkp:// |
# error not working with unprivileged | # error not working with unprivileged | ||
- | LANG=C SUITE=jessie MIRROR=http:// | + | LANG=C SUITE=jessie MIRROR=http:// |
- | ==== Bugfixes ==== | + | **Unprivileged related bugfixes** |
lxc-start: Permission denied - failed to create directory '/ | lxc-start: Permission denied - failed to create directory '/ | ||
Line 385: | Line 494: | ||
chown 100000: | chown 100000: | ||
+ | |||
+ | If you copy files from outside into the container, they have the wrong UID/GID. If a file should belong to root inside the container, run this on the root (host) system: | ||
+ | |||
+ | chown 100000: | ||
+ | |||
Line 392: | Line 506: | ||
* https:// | * https:// | ||
* https:// | * https:// | ||
+ | * https:// | ||
Line 447: | Line 562: | ||
* https:// | * https:// | ||
+ | |||
+ | ** systemd cgroups fuckup** | ||
+ | |||
+ | Could not find writable mount point for cgroup hierarchy 12 while trying to create cgroup | ||
+ | |||
+ | 12 is a systemd hierarchy - if you remove systemd and switch to sysvinit-core, | ||
+ | |||
+ | FIXME: | ||
+ | |||
+ | check that all of systemd is gone (uninstall every package still marked ii): | ||
+ | dpkg -l '*systemd*' | ||
+ | apt remove --purge '*systemd*' | ||
+ | | ||
+ | / | ||
+ | |||
+ | session | ||
+ | |||
+ | Check whether hierarchy 12 is still active: | ||
+ | |||
+ | cat / | ||
+ | |||
+ | WORKAROUND: | ||
+ | mcedit / | ||
+ | lxc.cgroup.use = @all | ||
+ | |||
+ | * this is helpful: https:// | ||
+ | * this is not: https:// | ||
+ | |||
+ | \\ | ||
**SSH Config** | **SSH Config** | ||
Line 463: | Line 607: | ||
* http:// | * http:// | ||
+ | |||
+ | or run this inside the container: | ||
+ | |||
+ | sed '/ | ||
FIXME maybe insecure | FIXME maybe insecure | ||
Line 475: | Line 623: | ||
+ | **rsyslog error** | ||
+ | |||
+ | TESTME | ||
+ | |||
+ | rsyslog doesn't start on boot and logs errors in syslog: | ||
+ | .. rsyslogd: imklog: cannot open kernel log(/ | ||
+ | .. rsyslogd-2145: | ||
+ | |||
+ | Disable kernel logging in container / | ||
+ | |||
+ | # $ModLoad imklog | ||
+ | |||
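Commenting out the module line can also be scripted. The sketch below takes the rsyslog config path as an argument, because the path is cut off in this view; `disable_imklog` is a hypothetical name. It keeps a `.bak` copy of the file and does nothing if the line is already commented out.

```shell
#!/bin/sh
# Hypothetical helper: comment out "$ModLoad imklog" in the given config file.
# A backup of the original file is written next to it with a .bak suffix.
disable_imklog() {
    conf="$1"
    sed -i.bak 's/^[[:space:]]*\$ModLoad[[:space:]]\{1,\}imklog/# &/' "$conf"
}

# Usage (run inside the container, then restart rsyslog):
#   disable_imklog <path to the rsyslog config>
```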
+ | |||
+ | |||
+ | \\ | ||
+ | ===== Move a root system into container ===== | ||
+ | |||
+ | you can disable many services: | ||
+ | * udev: the udev service (a hard dependency of systemd in Jessie) won't run in a container, but systemd recognizes this | ||
+ | * apparmor: mounts security-fs, | ||
+ | * kmod | ||
+ | * lm-sensors | ||
+ | * dbus | ||
+ | * kbd | ||
+ | * hdparm | ||
+ | * ... | ||
+ | |||
+ | configuration changes | ||
+ | |||
+ | * various sysctl.conf options do not work inside container | ||
+ | * / | ||
+ | |||
+ | If you get errors like: | ||
+ | INIT: Id " | ||
+ | |||
+ | disable the matching line in / | ||
+ | # 5: | ||