search results matching tag: Spam

» channel: learn


10 Fitness Tips to cope up with Coronavirus

How to Protect Your Vehicle from Theft

Top 4 Eyeglasses Lenses Prescription Plugins for Woocommerce

5 Safety Tips for Working with Industrial Tools

PersonalMoneyService

The Worst Typo I Ever Made

StukaFox says...

The worst DevOps mistake I ever made:

Assignment: On ~1,000 -physical- RHEL systems, change the default run level from command line to GUI (don't ask).

Solution: Hey, all our config files are controlled by Puppet, so this'll be easy!

(If you don't know what Puppet does, it enforces file configurations, so if you change a single file on the Puppetmaster, that change is pushed out to all servers running Puppet)

Ok, all I need to do is edit a single file, change a single number in said file and issue a single command: reboot. Easy-fuckin'-peasy.

The file I need to change is /etc/inittab -- this file tells a Linux system which "run level" it should enter upon booting up. Runlevel 3 is command line and runlevel 5 is a GUI like Gnome or some other tragic perversion of the whole reason you run Linux in the first place. All I had to do was change from runlevel 3 to runlevel 5. And reboot.
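For anyone who hasn't seen the file: on SysV-init systems the default runlevel lives in a single `id:N:initdefault:` entry in /etc/inittab. Here is a minimal Python sketch of reading that number back out. The function name `default_runlevel` and the sample contents are made up for illustration; nothing here comes from the actual hosts in the story.

```python
import re

# Matches the SysV-init default-runlevel entry, e.g. "id:3:initdefault:".
# Lines starting with "#" are comments and are not matched.
INITDEFAULT_RE = re.compile(r"^\s*[^:#\s]+:(\d):initdefault:", re.MULTILINE)


def default_runlevel(inittab_text: str) -> int:
    """Return the default runlevel declared in the given inittab contents."""
    match = INITDEFAULT_RE.search(inittab_text)
    if match is None:
        raise ValueError("no initdefault entry found")
    return int(match.group(1))


# Sample contents for illustration only, not taken from a real host.
sample = """\
# Default runlevel
id:3:initdefault:
"""

print(default_runlevel(sample))  # prints 3
```

The whole change in the story amounts to editing that one digit.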

So simple; so stupidly simple.

So stupidly simple at 3:00am. When I hadn't slept all night. On a production network. When I'm working from home away from the office. On a Saturday when no one is in said office.

I make my change and save it, then push it to the version control system. Puppet picks it up and pushes the change to ~1,000 physical computers.

Done and done!

Remember I mentioned that I had to change a single file AND execute a single command: reboot?

Here's where things go tragically wrong.

My changes worked PERFECTLY. Everything did exactly what I told it to: Puppet changed the file, and rebooted the servers.

Only they keep rebooting. They keep rebooting over and over and over and over. I can't access any server on the network. Worse, while I'm trying to figure out WTF I did wrong, the 30-minute time-out I'd set on our alerting system, Nagios, expires.

Did I mention that I pushed this change to ~1,000 servers? ~1,000 servers that won't stop rebooting and aren't reporting into Nagios, thus being marked as down?

At 3:31am, on Saturday morning, the pages to ALL the on-call engineers began. One page per engineer per machine. About one every two seconds. And I'm getting paged, too -- except some of the pages are Nagios and some are utterly irate engineers who want to know exactly WTF is going on and I can't tell which is which because I'm getting text-spammed like crazy.

And those servers? They just keep right on rebooting.

At that point, I felt the kind of existential dread that only people who work in IT know -- the kind of dread that arises a picosecond after you've hit ENTER and realized you've typed 'rm -rf /' or some-such -- because I knew at that very second exactly what I'd done wrong.

I'd typo'd "5" and made it "6" in the runlevel. And pushed it to ~1,000 -physical- servers. And then rebooted them ALL.

"So," you're asking, "Whyfor is runlevel 6 a big deal?"

Because of this:

runlevel 3: command line.
runlevel 5: GUI
runlevel 6: REBOOT THE FUCKING COMPUTER.

What I'd done was told every production server on our network to reboot as soon as it rebooted, which leads to another reboot, which leads to another reboot, lather rinse repeat.
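A guard along these lines would have refused the "6" before it ever reached Puppet. This is only a sketch: the names `check_default_runlevel` and `ALLOWED_DEFAULTS` are hypothetical, and treating 3 and 5 as the only acceptable defaults is an assumption based on the two values the story actually wanted.

```python
# Meaning of the runlevels on a SysV-init system such as older RHEL.
RUNLEVEL_MEANING = {
    0: "halt",
    1: "single-user mode",
    3: "multi-user, command line",
    5: "multi-user, GUI",
    6: "reboot",
}

# Assumption for this sketch: only 3 and 5 are acceptable boot defaults.
ALLOWED_DEFAULTS = {3, 5}


def check_default_runlevel(level: int) -> None:
    """Raise ValueError if `level` must not be used as the boot default."""
    if level not in ALLOWED_DEFAULTS:
        meaning = RUNLEVEL_MEANING.get(level, "unknown or site-specific")
        raise ValueError(
            f"runlevel {level} ({meaning}) is not a safe default; "
            f"expected one of {sorted(ALLOWED_DEFAULTS)}"
        )


# Try the intended value and the typo from the story.
for candidate in (5, 6):
    try:
        check_default_runlevel(candidate)
        print(f"runlevel {candidate}: ok")
    except ValueError as err:
        print(f"rejected: {err}")
```

An allow-list is used rather than a "not 6" check because 0 (halt) as a default is just as bad a day.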

At 3:45am on Saturday morning, I knew that every person in IT would have to drive into the office, visit every production server with a bootable USB key, change the BIOS to boot off the key, boot the server into Single User Mode, change the damned file by hand, then reboot the server. This takes about 10 minutes per server -- times ~1,000.
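To put a number on that: the arithmetic below uses the two figures from the story (about 1,000 servers, about 10 minutes each). The head count passed to `wall_clock_hours` is purely hypothetical, since the story doesn't say how many people drove in.

```python
# Figures from the story (both approximate).
servers = 1000
minutes_per_server = 10

total_minutes = servers * minutes_per_server
total_hours = total_minutes / 60

print(total_minutes)          # prints 10000
print(round(total_hours, 1))  # prints 166.7


def wall_clock_hours(engineers: int) -> float:
    """Hours of wall-clock time if the work splits evenly across engineers."""
    return total_hours / engineers


# Hypothetical head count, for illustration only.
print(round(wall_clock_hours(20), 1))  # prints 8.3
```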

I learned a number of valuable lessons that day:

1. DOUBLE CHECK YOUR FUCKING WORK.
2. See lesson #1
addendum: filing for unemployment insurance in Washington state is amazingly easy.
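One cheap way to act on lesson #1 is to make yourself look at exactly which lines changed before pushing. Here is a rough sketch using Python's standard `difflib`; the helper name `changed_lines` and the file contents are invented for the example, and a real setup would more likely hang a check like this off the version control system.

```python
import difflib


def changed_lines(old: str, new: str) -> list[str]:
    """Return only the removed (-) and added (+) lines between two versions."""
    diff = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="inittab (current)",
        tofile="inittab (proposed)",
        lineterm="",
    )
    return [
        line
        for line in diff
        # Keep real changes, skip the "---"/"+++" file header lines.
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Illustrative contents only: the intended edit was 3 -> 5, the typo was 3 -> 6.
current = "id:3:initdefault:\n"
proposed = "id:6:initdefault:\n"

for line in changed_lines(current, proposed):
    print(line)
# prints:
# -id:3:initdefault:
# +id:6:initdefault:
```

Two lines of output is not much to read at 3:00am, which is rather the point.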

And that was the very last time I ever worked on physical hardware. To this day, if it's not in the cloud, I ain't fucking touching it.

Here endeth the lesson.

Dock Builder Lake Oswego, OR - KC Marine LLC

Happy Chihuahua

Best Timeshare Exit Company

The 6 Best Technical Indicators FOR DAY TRADING

BEST STORY PRE WEDDING SHOOT | KANIKA & BISWAJIT | SUBODH BA

Precision cutting tools by Suncoast Precision Tools

Best CBD Topical Creams

VA Home Loan Massachusetts by Nextgen Mortgage


