automated deployment using open source

1. How to build a fully automated (or nearly so) deployment system using open source tools. Russell Miller SCALE 8x 2010 (16,080, in case you were wondering) [email_address]

2. Overview

This talk will concentrate on a CentOS/RHEL/Fedora environment.

3. The methods discussed in this talk will use only open source tools. You don't have to pay for anything I mention here unless you want to. In which case...

4. My PayPal address is...

5. My qualifications

Sysadmin/Developer for 12 years

6. Worked on many different kinds of environments.

7. Maintained deployment system for 4,000-server environment for leading Internet Shopping Comparison company in West Los Angeles.

8. Instrumental in bringing up two datacenters of about 300 servers each from scratch for leading Internet Advertising company in Irvine, including building the entire deployment system from metal to application.

9. The second datacenter, once all the physical hardware was there, took live traffic in less than two weeks from bare metal.

10. What I am not good at

Managers

11. PowerPoint/OpenOffice Impress presentations

12. Funny jokes.

13. Keeping an audience awake

14. And, of course, self-deprecating humor.

15. Why a deployment system?

If you have one server, you don't need one.

16. Deploying servers manually usually means many cycles of going back and forth with the application owners to validate. For each server. This eats up time and means bringing up servers can take two weeks or more...

17. And consistency is a lost cause.

18. You've already lost control the minute the OS is installed.

19. It's too easy to miss little things.

20. System Administrators are TERRIBLE at documentation.

21. Admit it... you are. No use denying it.

22. What about server spec sheets?

23. NO!

24. Why not?

It is practically impossible to get application owners to tell you what they need, if they even understand what they need in the first place.

25. It's not their fault. Their focus is on the application. They are not System Administrators. That is why you are paid.

26. Spec sheets do not accurately represent server builds.

27. They often contain incorrect or useless information.

28. Frankly, they serve no function other than to waste time and sow confusion, while providing an easily shattered veneer of repeatability.

29. So, naturally, they're usually the first thing tried.

30. How does it benefit you?

Very, very fast deployment times: metal to live server in less than an hour, and you can deploy as many servers at once as you can get open terminals to. (I've done 20 in an hour. And it's unlimited if you can get the serial consoles to work via expect.)

31. The code is the documentation. Server specs can never go out of sync because the server specs are actively deployed.

32. Very tight control over anything that is deployed to the servers.

33. (This means that even if someone installs something you don't want them to, you can simply have it removed within 10 minutes with no manual intervention.)

34. Repeatability.

35. And most importantly, astonished and very happy managers.

36. So do I have your attention? :)

37. So which open source tools?

You will need the following tools:

Request Tracker (or an asset tracker with a command line interface)

38. Nictool/djbdns (BIND and any other manager will work too, but this is what I use, because nictool has a simple schema and is scriptable)

39. dhcpd

40. PXEboot

41. httpd (you will see why in a moment)

42. A yum repository

43. And... puppet (cfengine or another configuration management tool will probably work, but I prefer puppet.)

44. How does this work?

The simple flow is:

Enter the information for the server into your Asset Tracker. The most important part is the MAC Address, though you can add other things as required. In multiple datacenters you might want a Location field, for example.

45. Make sure the DNS info is properly entered into your DNS server, by whatever means you use.

46. Tell your DHCP server you want to allow the server to PXE boot. This can happen manually or automatically.

47. I prefer manually, simply because if you set the server up to boot automatically you can get into a situation where the server accidentally reboots and rebuilds itself. This tends to make app owners unhappy.

48. And... let it install. An hour later you have a full build with no further manual intervention.

49. How does this work behind the scenes?

Magic?

50. No?

51. Guess I'll have to tell you the super-secret explanation.

52. Behind the scenes...

The RT server is the bedrock of the system. It contains all of the information needed to successfully build the system. A command line interface is absolutely required so that the scripts that actually do the build can get access to the info.

53. For example, at a minimum you'll want to put the MAC Address into the RT system. You may even want to populate DNS from an IP field. Every step of this process uses the info from RT.

54. There may be site-specific stuff you need to use. Don't be afraid to add or use it. This is only a framework.

55. Behind the scenes

You will next need a script to build the pxelinux.cfg file for pxebooting. This script pulls all of the required info out of the asset tracker (AT), such as the MAC address, location, etc., and generates the appropriate file. This is a custom script and is by no means one size fits all, but is fairly easy to write. The output of this script is a working pxelinux.cfg file.
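As a rough illustration of what such a generator might look like, here is a minimal Python sketch. It relies on one thing pxelinux really does: it looks in the pxelinux.cfg directory for a file named `01-` followed by the MAC address in lowercase with dashes. Everything else is an assumption made for the example: the asset record is a plain dict standing in for whatever your asset tracker returns, and the field names (`id`, `build`, `build_host`), the kernel and initrd paths, and the `ks.cgi` URL are all hypothetical.

```python
import re
from string import Template

MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")

# Hypothetical boot entry; paths and the kickstart URL are placeholders.
PXE_TEMPLATE = Template("""\
default install
prompt 0
timeout 1

label install
  kernel $kernel
  append initrd=$initrd ks=$ks_url ksdevice=bootif
  ipappend 2
""")


def pxe_filename(mac):
    """Return the per-host file name pxelinux searches for (01-aa-bb-...)."""
    normalized = mac.strip().lower().replace("-", ":")
    if not MAC_RE.match(normalized):
        raise ValueError("not a MAC address: %r" % mac)
    return "01-" + normalized.replace(":", "-")


def render_pxe_config(asset):
    """Render the config text for one asset record (a plain dict here)."""
    # The kickstart URL carries the asset ID so the server side can look it up.
    ks_url = "http://%s/cgi-bin/ks.cgi?asset=%s" % (asset["build_host"], asset["id"])
    return PXE_TEMPLATE.substitute(
        kernel="%s/vmlinuz" % asset["build"],
        initrd="%s/initrd.img" % asset["build"],
        ks_url=ks_url,
    )
```

A wrapper would write the rendered text to a file named by `pxe_filename()` under the TFTP server's pxelinux.cfg directory; where that directory lives is site-specific.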

56. What about kickstart?

Oh, here comes the genius part. (And I can say that because I didn't invent it, but I have it down to an art!)

57. The kickstart file is not a file at all. It is a CGI script. It goes to RT and DNS and gathers all of the information required, makes decisions on how to build the servers, and then custom generates a kickstart file.

58. It should at minimum take one argument: the RT asset ID. This is a unique identifier and allows all the information to be pulled out of the asset tracker to be used in the script.

59. Be careful!

This script is one of the bedrocks of what you are trying to do. It is also dangerous. It is dangerous because you are essentially trying to generate a script in one language (kickstart/bash) using another language (Perl, Python, whatever), so it rapidly becomes unmaintainable.
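To make the risk concrete, here is a minimal Python sketch of the less painful shape: the kickstart text sits in one template and the code only supplies values. It uses the standard library's `string.Template` purely as a stand-in for illustration; it is not the tool used in this talk. The asset field names (`id`, `hostname`, `repo_url`, `root_hash`, `timezone`) are hypothetical, and the kickstart directives are an illustrative CentOS 5-style subset (newer releases expect sections to be closed with `%end`).

```python
from string import Template

# All kickstart text lives here, not scattered through print statements.
KICKSTART = Template("""\
install
url --url=$repo_url
lang en_US.UTF-8
keyboard us
network --bootproto=dhcp --hostname=$hostname
rootpw --iscrypted $root_hash
timezone --utc $timezone
bootloader --location=mbr
clearpart --all --initlabel
autopart
reboot

%packages
@core
puppet

%post
echo "built from asset $asset_id" > /etc/build-info
""")


def render_kickstart(asset):
    """Fill the template from an asset record; a missing field raises KeyError."""
    return KICKSTART.substitute(
        asset_id=asset["id"],
        hostname=asset["hostname"],
        repo_url=asset["repo_url"],
        root_hash=asset["root_hash"],
        timezone=asset.get("timezone", "UTC"),
    )
```

A CGI wrapper would look up the asset by the ID it was passed, call `render_kickstart()`, and print the result as plain text. Failing loudly on a missing field is deliberate here: a half-filled kickstart is worse than no kickstart.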

60. I recommend something like Template::Toolkit to make it more manageable.

61. Build maintainability in from the beginning! You may not get another chance!

62. Yum repository

You'll also need a yum repository at this point.

63. DO NOT USE EXTERNAL REPOSITORIES.

64. Pull down an internal mirror and use that.

65. The reason for this is: control. If you use external repositories, you are putting control of releases and upgrades into their hands, not yours.

66. And while CentOS, Fedora, etc., are fairly good about it, they make mistakes and you do not want your production site to go down because of someone else's mistake. It's still your fault for not taking my advice. :)

67. Control

As you might have gathered by now, as an aside...

68. I am a control freak.

69. At least when it comes to System Administration.

70. But this is a good thing...

71. Because if you are in complete control of your environment, you reduce the chances of surprises.

72. And surprises are your worst enemy.

73. ... well, maybe not your WORST...

74. Check-in

At this point, you have:

An RT server with all of the info you need

75. A pxelinux.cfg generation script that will point to a custom kickstart file...

76. ... which is generated by a CGI script, which pulls the necessary info from RT.

77. And a yum repository which has all of the packages you need for a kickstart install.

78. Congratulations...

You can now build a server.

79. But what about configuration management and application deployment?

80. Oops. Looks like there's more to do.

81. The rest of the story

Now that you have a server that is up and on the network, it needs to be configured.

82. Each server will likely have a base config that every server needs. For example, snmp, ntp, etc., etc.

83. But each server also has an individual role. Application server, database server, Facebook browser, Quake Server, pr0n datastore...

84. Can you deploy these roles automatically too?

85. YES!

86. Configuration Management

For this task, you will want a configuration management system.

87. I use puppet.

88. Puppet is a configuration management system.

89. It controls what is deployed and what is NOT deployed.

90. It can deploy a package to one or a thousand servers at the same time.

91. And it slices, it dices, it writes bad checks...

92. Facter

Facter is puppet's best-kept little secret.

93. Facter executes little bits of Ruby code in order to determine facts about the system. OS release is one example.

94. But the facts that it determines are not limited to that...

95. The snippets of Ruby code can also call AT and pull facts out of AT and make them available to puppet.

96. AT and facter

This gives you the ability to dynamically decide what puppet installs simply by changing fields in the asset tracker.
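A small sketch of the idea, with a caveat: the talk describes Ruby snippets inside Facter, whereas this example swaps in Facter's external facts mechanism (an executable in the facts.d directory that prints `key=value` lines), which is available in later Facter releases, so that it can stay in Python. The `at_` prefix, the field names, and the `lookup_asset()` stub are hypothetical; a real script would query your asset tracker there.

```python
import re
import socket


def facts_from_asset(asset, prefix="at_"):
    """Turn asset-tracker fields into Facter-style key=value lines."""
    lines = []
    for name, value in sorted(asset.items()):
        # Fact names are kept to lowercase letters, digits and underscores.
        key = prefix + re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
        lines.append("%s=%s" % (key, value))
    return lines


def lookup_asset(hostname):
    """Placeholder for the real asset-tracker query (hypothetical data)."""
    return {"Server Function": "db", "Network Role": "vmware"}


if __name__ == "__main__":
    for line in facts_from_asset(lookup_asset(socket.getfqdn())):
        print(line)
```

With something like this in place, a manifest could choose what to install based on a fact such as `at_server_function`, so changing the field in the asset tracker changes what the next puppet run deploys.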

97. Putting it all together...

Here is a sample workflow for building a server. This is not theoretical; it is working at my workplace.

98. Put server in DNS, and in AT.

99. Set the fields appropriately. For example, Server Function is DB server, build is CentOS 5.4, network role is vmware, MAC address is correct.

100. Run a command to add server to dhcp server (pulling info from DNS and AT).

101. Reboot server.

102. Wait for it to build out.

103. Sign puppet certificate.

104. Wait for puppet to run.

105. Hand off fully built server.

106. Can it be more automated?

Indeed.

107. For example, you could automate the server restart by using a serial port and expect. For many enterprise-class servers this isn't reliable, but it'll work for some.

Many servers use SM CLP. This is great for consistent command lines. It doesn't work so well for programmatic resetting.

You could also set AT to automatically kick off a rebuild on setting a field.

108. You could have AT automatically populate DNS using the SOAP client for nictool...

109. It's not out of the realm of possibility that you could set up a server from bare metal all the way to application deployment simply by setting the fields correctly in AT and then setting a special field using this infrastructure.

110. Questions?
