twitch video uploader

So, I figured I'd make ONE MORE post today before I get back to doing other things that are likely of much higher importance (doing dishes, eating dinner, and working on work code, to name a few). While I was doing a bit of gaming a few weeks ago, I decided NOT to stream on twitch while attempting to defeat Bahamut in FFVII Remake. When I actually succeeded (much to my surprise, because that fight is quite difficult), I quickly realized that I couldn't retroactively post the saved footage to twitch since I wasn't broadcasting, but I COULD upload it to youtube.

I then thought: "well, I should be able to download this video or import it from youtube to twitch, right?" Well, it turns out it's not quite that easy. I decided (though I still haven't implemented this just yet) that I should create a simple video uploader for twitch (they DO have an API for uploading videos to the platform). However, it appears there is no built-in way in tools like OBS to upload a video after it's created. So, I'm going to work on a simple command-line tool to upload a video to twitch (of course, if someone else knows how to do this without a tool, I guess I wouldn't need to build one).

The design I’m going to use is something like this:

Pull auth token into the application
calculate the number of parts to break the video into, using the following logic:

partsCount = ceil(video_file_size / 25MB)   (rounded up, so the last, partial chunk still gets its own part)

set headers and POST to create the video using a net library (shown here as the equivalent curl):
response = curl -H 'Accept: application/vnd.twitchtv.v5+json' \
-H 'Authorization: OAuth {authToken}' \
-H 'Client-ID: {clientId}' \
-X POST 'https://api.twitch.tv/kraken/videos?channel_id={channelNumber}&title={videoFileTitle}'

loop over the video content, one part at a time, using the video ID and upload token returned by the previous call:
curl -H 'Content-Length: 26214400' \
--data-binary "@{videoName.extension}" \
-X PUT 'https://uploads.twitch.tv/upload/{videoId}?part={partNumber}&upload_token={uploadToken}'

complete upload:
curl -X POST 'https://uploads.twitch.tv/upload/{videoId}/complete?upload_token={uploadToken}'

This was all found in the twitch video upload guide, which is deprecated (API v5 is deprecated). There are likely more up-to-date instructions out there, but I haven't found them just yet.

In the event that these are still the correct instructions for the most up-to-date method of uploading videos, I at least have a record here to come back to when I get the chance to work on this project.
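To make the design a bit more concrete, here's a rough, untested bash sketch of how the tool could string those calls together. It assumes the deprecated v5 endpoints above still behave as documented, that jq is available to parse the create-video response, and that AUTH_TOKEN, CLIENT_ID, and CHANNEL_ID are already exported; the response field names are from memory of the old guide, so treat them as placeholders.

#!/usr/bin/env bash
# Rough sketch of the upload flow described above (v5 API assumed, untested).
set -euo pipefail

VIDEO_FILE="$1"
TITLE="$2"          # note: not URL-encoded here, this is just a sketch
PART_SIZE=26214400  # 25MB per part

# Step 1: create the video and pull out the video ID and upload token.
response=$(curl -s -H 'Accept: application/vnd.twitchtv.v5+json' \
  -H "Authorization: OAuth ${AUTH_TOKEN}" \
  -H "Client-ID: ${CLIENT_ID}" \
  -X POST "https://api.twitch.tv/kraken/videos?channel_id=${CHANNEL_ID}&title=${TITLE}")
video_id=$(echo "$response" | jq -r '.video._id')        # field names assumed from the old docs
upload_token=$(echo "$response" | jq -r '.upload.token')

# Step 2: figure out how many 25MB parts we need (rounding up).
file_size=$(stat -c%s "$VIDEO_FILE")   # GNU stat; use stat -f%z on BSD/macOS
parts_count=$(( (file_size + PART_SIZE - 1) / PART_SIZE ))

# Step 3: upload each part, slicing the file with dd and piping it to curl.
for (( part=1; part<=parts_count; part++ )); do
  dd if="$VIDEO_FILE" bs=$PART_SIZE skip=$((part - 1)) count=1 2>/dev/null | \
    curl -s --data-binary @- \
      -X PUT "https://uploads.twitch.tv/upload/${video_id}?part=${part}&upload_token=${upload_token}"
done

# Step 4: tell twitch the upload is complete.
curl -s -X POST "https://uploads.twitch.tv/upload/${video_id}/complete?upload_token=${upload_token}"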

Twitch

Hey everyone, I thought I'd also post about my gaming/streaming. I'd recommend checking out my twitch channel and hitting subscribe so you can watch me swear at my computer screen or fail repeatedly at hard video games (like FFVII Remake Hard Mode).

I used to do a lot of streaming, but I fell off for several years. I'm back and trying to be more regular about it, but my real job comes before my fun one (also, I don't make any money on twitch or other videos; it's purely for fun). Go check it out here (this is a random video example from my channel):

NAS 2.0

I mentioned in my previous post that I have been working on my NAS. I thought I would share the current state as well as what my plans are for my NAS going forward; forward into NAS 2.0.

So, what exactly was the state of my NAS before? Well, I had gotten a hot-swap 16-bay case from NORCO on Amazon. I filled it up with 16 SATA 3TB drives (some 4TB, as I plan to move to more capacity soon-ish), connected through 2x LSI SAS 9207-8i cards. This served quite well, but the NAS needed more. I had already used up over half of the storage (I have a lot of movies from my DVD/Blu-Ray collection on there), so I needed to expand.

In comes the 45 Drives chassis that has been touted by folks like Linus Tech Tips and Backblaze. I'm really not a fan of intel and I like to build my own machines, so with a chassis like the Storinator S45 I could get just the chassis (I worked with a cool sales guy over there named Dylan) plus a few mini-SAS controllers, and then be off to the races.

Storinator S45
My new Storinator S45 chassis to hold all the Terabytes.

I realized quickly, however, that even though I was adding considerable extra bandwidth and storage, I wasn't going to be able to make full use of that pathway because I was bound by the limitations of a very old CPU and motherboard. I'm still running an old AMD FX-8320 processor in this box, as well as an old motherboard that only supports PCIe 2.0. So, I'm missing out on bandwidth that has to be shared between 2x PCIe 3.0 mini-SAS controllers AND an intel 82599 SFP+ nic from 10Gtek. As a result, there's a lot of bandwidth sharing on the PCIe lanes, which results in bottlenecked performance.

I've set out to upgrade this system to be a proper server with proper ECC ram to match the rest of my rack. So, I started picking out the parts I want:
* AMD Epyc 7282 (16C/32T) CPU
* ASRock EPYCD8-2T (SP3) motherboard
* Some DDR4 ECC RAM (which I haven't settled on yet)
* A cooler for the CPU, since it doesn't ship with one

I've bought the first two items on the list, and the CPU is already here. I'm just waiting on the motherboard, figuring out what RAM I want (and how much), and then the CPU cooler setup. I haven't settled on those because I'm breaking the project up over several months to stay within my monthly "fun budget". However, when this is done, I'll have a NAS that should be more than capable of delivering the full bandwidth and response times of a very fast NAS: a 10Gbps NIC, 3x mini-SAS 16-port controllers, and 45 total drives at 3TB per drive (again, I'll be wasting 1TB on several drives until I fill all 45 slots AND start to swap the 3TB drives out for 4TB drives).

Once I've replaced the parts in the chassis, I'll post a "finished" picture to share with everyone. Also, so that everyone knows, I have 18 drives in it now at 3TB each. Since my NAS is configured in RAID 10, that means I have 24TB of usable storage space, and I'll be growing that by buying a new drive every month until I fill all 30 slots (the two mini-SAS cards only use 15 of their 16 ports each, and I don't have the third card yet). After I fill all 30 slots, I'll upgrade the drives, then add the third SAS card and more drives (assuming the price of 8TB drives hasn't dropped dramatically in that timeframe).

Until next time!

…It’s been a long time

I haven’t written a blog post in probably a few years. During that time, I let my SSL lapse and even lost the ability to log into wordpress. Never fear, however! I’m back. I restored SSL through certbot updates, restored my access to the site (I had 2FA that I think broke due to me switching keys and losing my password to the site), and I updated wordpress so that I was back online. I also removed the Cloudfront distro that was in front of the site, because this really just doesn’t get the traffic necessary to warrant a CDN.

This site has undergone a few changes that I’ve not really highlighted over the years because I’ve been quite busy with other projects (I still have other projects, but I’m picking this one back up).

I hope to switch this blog over to docker as well as move to writing my posts in markdown. Ideally, I'll drop wordpress before too long and host it all natively in golang, html, and css, and manage everything through CI/CD, but that's not in the near future.

Recently, I’ve been getting back into gaming (just beat Doom Eternal a few months back and beat FFVII Remake Part 1 on normal and working my way through Hard Mode). I’ve also been doing a lot of work on my NAS, which I’ll throw an update on here shortly.

Sed, a super-short runthrough of a new trick I learned with my favorite editing tool

I'm lazy, so I like to edit files as efficiently as possible. Often, when I want to apply a patterned change to a file (like removing the "#" sign from the beginning of several lines…which vim loves to insert on a new line when the previous line already starts with a "#"), I like to use one of the most powerful tools in bash: sed.

Sed can be daunting and disastrous, but when used correctly, it has the potential to speed up config file edits and other edits that would otherwise take more time to accomplish. So, while playing around with sysctl.conf tonight, I learned a new trick: using the in-place replacement (-i 's///') with line numbers…specifically, starting at a given line and ending at the end of the file. The line-numbers part was quick and easy enough to figure out…but what if you have to change a pattern across a 5,000+ line file and you don't want to touch the first 1,000 lines, or perhaps you just want to start 14 lines in?

Either way, you're going to need to start at an arbitrary line number (easy) and tell sed to keep going until the end of the file (the slightly trickier bit). So, how do you do this? It's a lot simpler than it would seem. Let's take the example from my sysctl.conf file:

sed -i '14,$s/^# //g' /etc/sysctl.conf

So, what does all this mean? And, where’s the magic happening?

Well, as I already mentioned, "-i" means in-place (sed writes the result back to the file instead of just printing to standard out). The "14" is the line I want to start with, the "," separates it from the second half of the address (the ending line)…which happens to be "$"…that's right…"$."

This makes sense, though: "$" in sed represents the end of a line when you're matching a pattern within a line, so it follows that it means the end of the stream/end of the file when you're addressing the whole stream rather than a single line.

The "^" represents the start of the line, the "# " is matched literally, and "g" means global…i.e. if the pattern matched more than once on a line, every occurrence would be replaced…but since the pattern is anchored with "^", it can't match more than once anyway, so the "g" is redundant here.
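To make that concrete, here's a quick throwaway demo (the file path and contents are made up purely for illustration): the first two lines are left alone and everything from line 3 onward gets its leading "# " stripped.

printf '# one\n# two\n# three\n# four\n' > /tmp/sed-demo.txt
sed -i '3,$s/^# //g' /tmp/sed-demo.txt
cat /tmp/sed-demo.txt
# prints:
# # one
# # two
# three
# four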

This made removing all the "#" symbols that vim inserted much faster…maybe not this time, since I had to go look it up, but for future editing sessions this will certainly prove useful.

Internets on a Plane

I'm going to keep this one rather short. Basically, the bane of every traveller is staying connected to the internet…and the prices for wifi on a plane are absolutely ludicrous. However, these plane wifi providers often have not plugged all the holes in their security to prevent you from getting out.

In fact, plane wifi providers (such as gogo inflight) aren't able to close off every path out; assuming otherwise is the same fallacy as anyone who uses the internet and thinks they're "secure." The reality is that nothing is ever totally secure, and it can always be defeated. In the case of airplane wifi providers, it's usually pretty easy to do.

I was looking into ways to do this a while back and found multiple posts about multiple plane wifi providers. They all seem to be vulnerable to at least one or two different types of bypass.

Gogo has a super easy loophole through their network (in fact, iirc they have multiple). Someone on lifehacker found that you can set ssh to listen on port 3128 on a remote server prior to hopping on a flight and then use your command line to set up an ssh port forward to tunnel traffic through. They make this much more convoluted and difficult on lifehacker than it needs to be. The easy way is to use a dynamic port forward, which you can do like this:

ssh -i <identity_file> -D <local_port> user@host

So, what this does is set up a dynamic port forward (the -D flag): ssh opens a SOCKS listener on the local port you specify, and anything sent to that port is tunneled through the ssh connection, with the remote server making the outbound connections on your behalf. Then, you set up a socks proxy in your browser pointed at that local port, and you'll start passing traffic through your ssh connection over the dynamic port forward.

This requires a bit of setup prior to getting on the plane, but the relevant code snippet would look like this:

$ grep Port /etc/ssh/sshd_config
#Port 22
Port 22
Port 3128

This tells sshd to listen on both port 22 and port 3128, so you can use ssh as normal when you're not on a plane and use port 3128 when you are, to slip through their security hole.
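Putting it all together, the before/during flow looks something like this (the key path, hostname, and local SOCKS port are placeholders, and the service name may differ on your distro):

# Before the flight, on the remote server: add the extra listener and restart sshd.
echo 'Port 3128' | sudo tee -a /etc/ssh/sshd_config
sudo systemctl restart sshd

# On the plane, from the laptop: connect to the server on port 3128 and open a
# local SOCKS listener on 1080, then point the browser's SOCKS5 proxy at 127.0.0.1:1080.
ssh -i ~/.ssh/id_ed25519 -p 3128 -D 1080 user@my.remote.host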

Now, I could go on a diatribe about WHY this hole is or isn't there, but it really isn't important. It likely needs to be there for the service to function on the plane, so it will likely always be there to take advantage of.

I would like to make the disclaimer that I do not condone circumventing plane wifi restrictions; this is meant as a theoretical post that improves upon the lifehacker post discussing the same issue.

pfSense SFP+ Firewall

I’ve been experimenting a bit with pfSense as a firewall for >1Gbps networking. Comcast provides (in limited areas) speeds of 2Gbps (full duplex…ie. both up and down). Per my research, it seems they terminate this with two connections:

  1. RJ-45 1Gbps ethernet port
  2. SFP+ interface (not sure if it’s blank or if it has a MM SR, SM LR, or something else entirely)

Additionally, from their documentation and from people documenting it online, to get the full 2Gbps in a single flow you have to use the SFP+ interface (duh, that one is a given, since GigE is limited to 1Gbps full duplex while SFP+ can handle 10Gbps full duplex). That said, if you want the full gigglebits (my terminology for gigabit, as it's more fun), you need something with SFP+ (or higher…but let's face it, QSFP, QSFP+, zQSFP, and SFP28 are all too expensive right now).

Based on current pricing of 10G, 25G, 40G, and 100G gear, the most affordable option for the common consumer is going to be 10G. You can pick up a nice intel 82599 dual-port card (yes, the same chip used in AWS's enhanced networking) from Amazon (or any other retailer, really) for about $160. 10G switches are coming down in price as well…I picked mine up from unix surplus for $240 (a used 24-port switch), and you can pick up a 16-port SFP+ ubiquiti switch for around $500 (brand new).

So, the cost of implementing 10G-based networking is rather cheap. Thus, the attempt to set up my own firewall with 10G networking (see…I eventually got to my point…Comcast+cheaper SFP+ tech = custom pfSense build with 10Gtek intel 82599 dual-port SFP+ adapter for WAN and LAN connections).

In the process of all this, I've run into several snags. The most annoying is the Ring video doorbell not working…but I've just accepted that as a loss for now, until I can revisit it down the road. There are a few other hiccups as well, all of which seem to come back to the same issue: incompatibility with, or poor handling of, UDP packets.

Diagnosing the Ring video doorbell was/is tough, as they don't really give you a whole lot of information to troubleshoot with, but after seeing enough patterns, it seems that something is going wrong in handling UDP over wifi via my unifi AP-AC-PRO that pfSense is not correcting for. Somehow, I think my Asus router was doing some sort of magic to repair the packets or something, because the problems are not there with the AsusWRT/Merlin build.

I noticed that my VPN was disconnecting every 20 minutes (client-to-site, for work). What I discovered was that my VPN client was sending control packets at about a 10:1 ratio…it would send 10 and receive one back…so, if the control packet back did not come in, it would drop the connection. This all happened around the same time that packets would stop flowing over the network (i.e. no control packet when expected, and at about the same time, ping would stop working over the VPN). After some more testing, I found that this ONLY happened when I was on wifi (i.e. using my ubiquiti AP-AC-PRO). When I swapped over to wired ethernet, the connection would go for hours without dropping, with the same rate of control packets.

So, I came to the conclusion that one of a few things was happening:

  • Bad network cable (easy fix)
  • Ubiquiti AP-AC-PRO does something screwy with rx packets and doesn't tx them over the eth0 link (very not good if that's the issue, and difficult to fix short of removing the AP)
  • Some funky OS-level packet drop due to flags, states, or bad checksums
  • Weird wifi behavior with duplicate packets, since wifi expects a certain level of loss and essentially double-transmits as a safety precaution

Since I saw the behavior in more than just Ring, it's safe to conclude that it's not an app-level problem (per se) with Ring. While I'm sure the VPN client and Ring could probably handle packets better by leveraging TCP more (i.e. for sending things like control packets and checking connectivity/packet counters on each end, with some fallback or workaround), this generally points to a different problem.
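When I get time to dig back into it, the plan is basically to capture the same traffic on both paths and compare. Something like the sketch below, where the interface name (igb1) and the VPN endpoint address are placeholders for whatever your pfSense LAN interface and VPN server actually are:

# Capture the VPN's UDP traffic on the pfSense LAN interface for ~20 minutes
# (roughly one drop interval), once while the client is on wifi and once while wired.
timeout 1200 tcpdump -ni igb1 -w /tmp/vpn-wifi.pcap 'udp and host 203.0.113.10'
# Then eyeball the two captures for gaps and compare packet counts.
tcpdump -nr /tmp/vpn-wifi.pcap | wc -l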

Anyway, only time will tell how much of a headache this is going to be moving forward and what the actual RCA ends up being, but there are a lot of cool things you can set up with pfSense that outweigh the hassle of dealing with wireless and apps that don't thrive in non-standard environments (i.e. Ring, which clearly did not test a similar use-case with their doorbell, otherwise there would be a knowledge article, video, etc. to help make sure you have pfSense configured to work optimally with the Ring doorbell).

I’d like to point out I’m not knocking Ring…I’m just pointing out that not all edge cases have been fleshed out, which is to be expected when you’re trying to build a product for specific demographics. You’ll only target the largest majority you can feasibly address, which generally doesn’t include custom home-built router solutions like pfSense, VyOS, ipfire, ClearOS, etc.

The target audience will likely be using AsusWRT, Netgear, D-Link, etc. and won't care about getting 2Gbps+ of throughput to the internet. So, at this point, I have a stable-ish firewall (we'll see how long I can keep the uptime) with a cool and relatively easy-to-configure IPsec+BGP setup (though it takes some tweaks to keep it from crashing). That lets you do things like connect your home network into your AWS account (or your corporate network) using a CGW and VGW to build IPsec tunnels, then overlay BGP routing on top so that your VPC's route tables AND your internal network's route tables update dynamically (for example, if you connect a second VPC to the first VPC, which you then connect back to the "home" network, or if you connect in a hub-and-spoke style and propagate routes for each VPC to one another via the pfSense box).
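For flavor, the BGP half of that ends up looking roughly like the FRR-style sketch below (on pfSense this is driven through the FRR package's GUI rather than a hand-written config, and every address and ASN here is a made-up placeholder, not my actual setup):

router bgp 65000
 bgp router-id 169.254.12.2
 ! the AWS VGW end of the tunnel's inside addressing
 neighbor 169.254.12.1 remote-as 64512
 address-family ipv4 unicast
  network 192.168.0.0/16
  neighbor 169.254.12.1 activate
 exit-address-family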

IPv6 works pretty much out of the box as dual-stack and does not take much work to configure. pfSense has lots of packages that can be installed to enhance functionality, AND it can be installed on consumer hardware (or enterprise), which gives you the flexibility to put as much horsepower behind your router as you may require. I personally have 16GB of ram with room to grow to 32GB, one X520-DA2 (intel 82599), an AMD Ryzen 7 1700X (with SMT turned off for better pps), AIO water cooling, a 250GB ssd for storing logs, and a low-power GFX card for serial/console output so that I can configure the system via keyboard should networking or other components fail. I could've gone and bought another rack server, but I wanted this to be able to go in the closet of my bedroom, so I wanted it to be quiet…thus the AIO water cooling.

All this said, I don’t have too many complaints about this now that I’ve worked out some kinks in the system. I just need to finish wiring my house for ethernet and fiber so that I can put this in the closet and call it a day instead of having it sit downstairs (my cable management at home is a mess right now).

…well, it’s getting late. I’ll try to write another tech blog post a little later down the line…probably will be related to docker, chef, ansible, or some other automation-like thing…maybe another post about OpenStack…just depends on what the next project is that I get some work done on.

TripleO Update

This update is going to be much shorter than most posts I make. I've been trying to get TripleO quickstart to work, and it seems that something in the setup process is causing introspection to fail. I have not solved the issue yet. It seems that this is LIKELY a dnsmasq issue; however, I can't confirm it until I've actually fixed it.

…This is the life of an engineer. Hypothesis -> figure out how to prove/disprove hypothesis -> Fix problem or back to step 1.

So, WHY do I think it's dnsmasq? Well, the message coming from the server trying to PXE boot is that no boot file was received. This means the server was handed an IP address, but the configuration never pushed the pxe file to it via tftp, per the dnsmasq logs in journald (via journalctl): there are no messages showing a transfer of the file over tftp.
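For reference, this is roughly where I'm looking while the node tries to PXE boot (the unit name is a guess; in a quickstart deployment the dnsmasq instance may be run by ironic-inspector under a different unit name):

# Follow the dhcp/tftp chatter from dnsmasq while the node attempts its PXE boot.
journalctl -f -u dnsmasq | egrep -i 'tftp|pxe|dhcp'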

Ok, I swear I’m done with this one.

EFI and BIOS pxe boot via dnsmasq and raspberry pi

So, I've done this a few times before at this point, but I have gone through a lot of pain each time setting up a pxe server on my raspberry pi. Part of the need for this is that when I have a machine to rebuild, I want to kick off the rebuild from the raspberry pi. After a lot of reading, some trial and error, and settling for a while, I found out how to get efi pxe boot working, but couldn't get it to work with bios at the same time. So, it was either one or the other.

I ended up leaving it like this because I wanted my servers to be EFI moving forward, as some features end up missing from newer OSes when you use BIOS to do the pxe boot. So, I left it as-is with a specific configuration in dnsmasq until such time as I could figure out what was causing the dual pxe operation to fail. I found the answer today and got my config working. In order to record this for my own use, I'm going to place it in a repository so that I can refer back to it later in case I need it.

# cat /etc/dnsmasq.conf |egrep -v "^#|^$"
log-dhcp
enable-tftp
port=0
dhcp-no-override
dhcp-match=set:efi-x86_64,option:client-arch,7
dhcp-match=set:efi-x86_64,option:client-arch,9
dhcp-match=set:efi-x86,option:client-arch,6
dhcp-match=set:bios,option:client-arch,0
dhcp-boot=tag:efi-x86_64,"efi64/grubx64.efi"
dhcp-boot=tag:efi-x86,"efi32/syslinux.efi"
dhcp-boot=tag:bios,"bios/pxelinux.0"
interface=eth0
bind-interfaces
dhcp-range=192.168.2.11,192.168.2.100,1h
dhcp-option=3,192.168.0.1
dhcp-option=6,192.168.0.1,8.8.8.8,209.244.0.3
dhcp-option=28,192.168.63.255
tftp-root=/tftpboot

I cut out all of the lines that are commented out. So, I want to dive further into what these lines mean. The first option turns on dhcp logging. The second line enables the tftp server. port=0 tells dnsmasq not to act as a dns server. According to the dnsmasq man page, the dhcp-no-override option does the following:

--dhcp-no-override (IPv4 only) Disable re-use of the DHCP servername and filename fields as extra option space. If it can, dnsmasq moves the boot server and filename information (from dhcp-boot) out of their dedicated fields into DHCP options. This makes extra space available in the DHCP packet for options but can, rarely, confuse old or broken clients. This flag forces "simple and safe" behaviour to avoid problems in such a case.

This aims to clear up space in the dhcp packet while at the same time preventing confusion in older or broken dhcp clients via the simple/safe behavior. The dhcp-match lines look at the architecture information the dhcp client passes to dnsmasq to determine what type of system it is: efi, bios, or something more specialized. This config only looks for efi32, efi64, and bios systems, though you could add more. Once it matches a client type, it sets a tag. That tag is then used to determine which boot file to send to the client via the dhcp-boot lines, which is where the magic happens.

Once you've set all that, you set the interface that dnsmasq listens on (eth0). By default, dnsmasq binds the wildcard address (i.e. 0.0.0.0) and simply discards requests that arrive on the wrong interface; bind-interfaces forces it to bind only the addresses of the interfaces it's actually listening on, which also keeps it from clashing with anything else that wants those ports.

dhcp-range sets the ip range dnsmasq is allowed to hand out to pxe boot clients. The next three dhcp-options set the gateway (option 3), the dns servers to use for resolution (option 6), and the broadcast address (option 28). This allows external communication when you're trying to grab data from public repository mirrors while installing via pxe. These options could be dropped if you have your own internal repositories, but because I'm doing a fairly base install on these machines and I'm not yet running my own mirrors locally (building repositories and rpms will probably happen sometime down the road, once I'm done setting up TripleO and the rest of my infrastructure), I'm still allowing external repository grabs for now.

The final line in the config tells dnsmasq to run the tftp server out of the /tftpboot directory, and all file references are relative to that directory. That is why you see references to efi64/ and bios/ above: they mean look in /tftpboot/efi64/grubx64.efi and /tftpboot/bios/pxelinux.0 for the pxe boot files to pass to the client.
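A quick sanity check I like to run after touching any of this is to pull one of the boot files back over tftp from another box on the same segment (the pi's address below is a placeholder; the client here is the standard tftp-hpa one):

tftp 192.168.2.1 -c get bios/pxelinux.0 /tmp/pxelinux.0
ls -l /tmp/pxelinux.0   # a non-zero file means the tftp side is serving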

The coolest part of everything is configuring the pxe boot setup from this point on. The most important thing is laying out the directory structure and the boot files, which I had to grab from either the syslinux package or from the installer's grubx64.efi off the image I wanted to boot. I'll describe the other necessary configuration in another post, but the smart way to set up pxe boot isn't a single pxe boot file with multiple boot options; it's to come up with a kickstart file for each mac address of your machines. This creates a custom install per machine that, once configured, lets you reset a server back to its base config over and over again. I just copied the file to the name of the mac address of the nic and made custom adjustments to the copy for each new machine.
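As an illustration of what I mean (the paths, MAC, and kickstart URL below are all made-up placeholders), on the BIOS side this ends up as a per-host pxelinux config named after the NIC's MAC with the usual "01-" prefix, pointing at that host's own kickstart:

# cat /tftpboot/bios/pxelinux.cfg/01-aa-bb-cc-dd-ee-ff
DEFAULT rebuild
LABEL rebuild
  KERNEL images/vmlinuz
  APPEND initrd=images/initrd.img inst.ks=http://192.168.2.1/ks/aa-bb-cc-dd-ee-ff.cfg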

The downside to all this is that TripleO runs its own pxe boot server, so this becomes rather moot once you set up the TripleO undercloud server. More to come on this later, when I finish the undercloud and overcloud deployments. The way to manage this is to set up vlans to segment the traffic so that the pxe boot servers don't overlap each other. Additionally, you would want the switch to be able to flip between the vlans, by running a script on the switch to change the vlan when you want to hit a certain pxe boot server.

Not to get into too many details (as I have yet to work this out), the flow would work like this (a rough command sketch follows the list):

  1. Use ipmitool to set boot options on the bmc to boot pxe on next boot
  2. Set the vlan of the switch port for the nic of the server to that of the pxe boot server that you want to communicate with.
  3. Reboot the server that you want to provision over the pxe vlan you previously selected by using an ipmitool power reset over lanplus. You could also do this from within the OS, but there's no need for it.
  4. Wait and watch to make sure that it connects to the correct pxe boot server. Once the install is complete, force the vlan back to the previous setting and reboot the server.
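Steps 1 and 3 from the list, expressed as ipmitool commands (the BMC address and credentials are placeholders; step 2 is entirely dependent on your switch):

# Tell the BMC to PXE boot on the next boot only, then power-cycle the box.
ipmitool -I lanplus -H 10.0.0.50 -U admin -P 'changeme' chassis bootdev pxe
ipmitool -I lanplus -H 10.0.0.50 -U admin -P 'changeme' power reset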

I'll go into these things in other posts later down the line, but this is the gist of getting a home lab set up with openstack, multiple pxe boot servers capable of efi and bios booting, AND setting up switches/vlans/routers. There will also be a buildlog of the router that I've recently purchased, as it is going to be running VyOS (my first attempt at it) to maintain the 10G backbone that I'm setting up for my home lab and my home network.

TripleO Quickstart and Thoughts on Cloud infrastructure provisioning

So, I've been reviewing several different instruction sets for getting Openstack up and running using TripleO. If you're not familiar with TripleO, it stands for Openstack on Openstack (thus OoO, or TripleO). It provides a means of doing quicker deployments of openstack. However, if you're like me and you want to get a full HA environment up, the barrier to entry really isn't how difficult it is to set up one, two, three, or even 100 nodes. The problem is more along the lines of how much hardware I have available.

Earlier picture of Stage 2 of HomeLab

As seen above, this is my home lab back before I moved to my new house. I still have the same four servers, but I now have a new switch and a few 10Gbps NICs (Intel 82599 x520 chips) in some of the servers, and I'm working on getting all four onto the new NICs. The problem, as can be seen in the picture above, is that I have 4 servers, not 6. This makes rolling out TripleO in an HA fashion impossible, as the minimum node count for HA is 6 if you separate storage, compute, and controller roles. Additionally, TripleO has the added requirement of needing another machine for provisioning.

So, why is this? Well, the way TripleO provisions is that it uses two separate parts:

  1. The Undercloud
  2. The Overcloud

So, what are the undercloud and the overcloud? The undercloud is an all-in-one openstack deployment on a single node, with heat installed, whose job is to provision the tenant-facing cloud, a.k.a. the overcloud. As you could've guessed, the overcloud is the openstack you actually use for all of the various services (nova, heat, cinder/ceph/swift, ironic, trove, designate, magnum, etc.). This is where we're trying to run the majority of our workload.
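In CLI terms, the split looks roughly like this (heavily simplified, and the exact command names vary a bit between releases; the real deploy also takes heat template overrides and an instackenv.json describing each node's BMC):

# On the provisioning machine: stand up the all-in-one undercloud.
openstack undercloud install
# Register the bare-metal nodes with ironic, then let heat drive the overcloud rollout.
openstack overcloud node import instackenv.json
openstack overcloud deploy --templates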

So, if you’re the standard home user, you probably don’t want to run 6+ home servers as that can run up quite the power bill (unless you’re running on pure solar/wind/etc. and have excess power to spare…in which case…can I host my rack at your house?).

This begs the question: how can I maximize my available resources and still spin up a full-blown HA cloud? And what defines an HA cloud (even a small PoC or small-scale lab)? Typical HA requires 3 nodes, as 2 is more susceptible to the split-brain problem. The current iteration of tripleO deploys to bare-metal nodes, each with a specific role.

So, how could we improve this? (I actually thought this was how it worked when I first looked into tripleO.) I envision a system that provisions all available nodes as ironic nodes running nova. Then, images would be pushed to nova on those nodes to carry the other roles:

  • ceph/swift/cinder/storage
  • controller

So, why do I think this would work and be a better way of deploying the infrastructure? In theory, it could allow the entire deployment to be self-contained and self-updating. Suppose you needed to migrate to a new version of openstack. You would launch a new image for nova on all hosts except the one where the controller is running, and migrate the controller to the node that has not been updated. Then, once the other nova nodes are updated, you could update the controller database, which is running on the node you migrated to, and finally upgrade the last nova node. Additionally, it would allow for more seamless scaling.

This could be achieved through currently existing networking constructs such as vxlan, vlan, and GRE. The tricky part here would be how to transfer the ironic nodes from the undercloud to the overcloud for management. Once that hurdle could be solved, the entire system could be self-sustaining.

At some point down the road, I’ll do a video on this concept.

If you know how to contact me, let me know what you think of the idea and let me know if you have any ideas on how to implement this. If I had more time to play with this and contribute to the tripleO/openstack projects, I would. Perhaps I will have time to do this down the road once I’m more settled in my house.