Logging on to machines, particularly production machines, is not a DevOps best practice, so how can we get the logs we need to diagnose faults?
Particularly with production systems, it’s poor DevOps practice to log in to machines to make changes. Ideally, everything should happen through your automated systems so there is no need. This prevents little ‘tweaks’ that might fix a short-term issue but, in the long term, cause drift in the platform with possible repercussions later. Such tweaks also mean that if a new system is required, we might not have recorded all the setup it needs, because a fix was put in outside of the automated deployment processes. This makes ongoing support and maintenance much harder.
So, we limit access to machines to stop exactly this kind of issue. Nobody can just ‘tidy up’ while they are there anymore, or make that little configuration change to make it a ‘comfortable’ place to work.
However, in order to determine the cause of an issue on a platform, developers often need the appropriate logs and dumps to analyse.
So how do we reconcile the need to keep people off the servers but also provide access to the necessary data for fault analysis?
The same can also be true if you have a support case with a third-party software vendor. They will need data to resolve the product issue. IBM Support, for example, often has a ‘must gather’ bundle for its products, and it’s a lot easier if gathering these is automated.
OK, so we can quite easily create processes to capture the necessary logs, dumps, configuration, etc., but where do we put that bundle? It needs to go somewhere off the server that my agent can write to, and I don’t really want to have to get firewall ports opened.
So why not use the capabilities of UrbanCode to upload the bundle and make it available through the UCD GUI for download and subsequent analysis?
The technique I describe here uses a UCD Component version to hold the uploaded logs. In this example I created two components:
- Log Receiver
This component will receive new component versions containing the ‘mustGather’ assets. Nothing is ever deployed from this component and the component doesn’t need to be part of an application. This component could be shared by many applications or as appropriate in your organisation.
- Log Sender
This is a component that holds the must gather process, but you could put the process in any component that is part of your application, or use a generic process instead.
The Log Receiver component is set up with its own clean-up rules so that it doesn’t accumulate old versions and take up valuable CodeStation disk space. I set both Days to Keep and Versions to Keep to 1 in the component’s configuration, so the UCD daily clean-up will remove all but the most recent logs if they are not removed manually in a timely manner.
I created this simple component process (Operational (Without a version)) to capture the logs and upload them into a version of the Log Receiver Component.
The first and last steps just clean out the agent’s working directory, to avoid uploading old information or holding disk space unnecessarily.
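As a rough illustration of what those clean-up steps do, here is a minimal Python sketch. It is not the UCD step itself; the function name `clean_directory` is made up for this example, and it assumes you are happy to delete everything inside the directory you pass in.

```python
import shutil
from pathlib import Path


def clean_directory(directory):
    """Remove everything inside `directory` but keep the directory itself.

    Hypothetical helper for illustration only. Returns the number of
    top-level entries that were removed.
    """
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    removed = 0
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            # Real sub-directories are removed recursively.
            shutil.rmtree(entry)
        else:
            # Files and symlinks are simply unlinked.
            entry.unlink()
        removed += 1
    return removed
```

Running the same clean-up at both ends of the process means a failed earlier run cannot leak stale files into the next bundle.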
The second step, a shell step in this example, gathers all of the files I want to send back in the ‘mustGather’ package.
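My gather step is a shell step, but the idea is easy to sketch in Python too. The following is only an illustration under assumptions: the function name `gather_files`, the glob patterns and the directory layout are all hypothetical, and you would substitute whatever your product’s must gather instructions call for.

```python
import shutil
from pathlib import Path


def gather_files(patterns, source_root, staging_dir):
    """Copy files matching glob patterns under source_root into staging_dir.

    Hypothetical helper for illustration only. Relative paths are preserved
    so that files with the same name in different folders do not collide.
    staging_dir should live outside source_root, otherwise copied files
    could be matched again on a later pattern.
    Returns the sorted relative paths (POSIX style) that were copied.
    """
    source_root = Path(source_root)
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for src in source_root.glob(pattern):
            if not src.is_file():
                continue
            rel = src.relative_to(source_root)
            dest = staging_dir / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 keeps the file timestamps
            copied.append(rel.as_posix())
    return sorted(set(copied))
```

A call such as `gather_files(["logs/*.log", "*.xml"], "/opt/myapp", "staging")` would collect the log files and top-level XML configuration; both the patterns and the `/opt/myapp` path are placeholders.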
The third step just creates a zip file of the whole set, so we have one bundle. I named the bundle to include the application name, the environment name and the name of the agent doing the gathering.
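To make the naming idea concrete, here is a small Python sketch of building a bundle name from those three values and zipping a staging directory. The hyphen separator, the character clean-up and the function names `bundle_name` and `zip_directory` are my assumptions for this example, not the exact format used in my process. In a UCD process the three values would normally come from process properties rather than literals.

```python
import re
import zipfile
from pathlib import Path


def bundle_name(application, environment, agent):
    """Build a file-system-safe bundle name from the three identifying values.

    Assumed format: <application>-<environment>-<agent>.zip, with any run of
    characters outside letters, digits, '.', '_' and '-' replaced by '_'.
    """
    parts = [
        re.sub(r"[^A-Za-z0-9._-]+", "_", value.strip())
        for value in (application, environment, agent)
    ]
    return "-".join(parts) + ".zip"


def zip_directory(staging_dir, zip_path):
    """Zip every file under staging_dir into zip_path.

    Returns the archive member names in the order they were written.
    """
    staging_dir = Path(staging_dir)
    zip_path = Path(zip_path)
    names = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(staging_dir.rglob("*")):
            # Skip directories, and the archive itself if it sits in staging_dir.
            if path.is_file() and path.resolve() != zip_path.resolve():
                arcname = path.relative_to(staging_dir).as_posix()
                zf.write(path, arcname)
                names.append(arcname)
    return names
```

Putting the application, environment and agent in the name pays off when several applications share one Log Receiver component, since the name alone tells you where a bundle came from.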
The fourth step creates a new component version and uploads the bundle into CodeStation. I named the version in the same manner as the bundle so that, if you are sharing the Log Receiver component, you can easily identify the application from which it was gathered. Of course, you could name it however you like. In my example I’m relying on the dates and times on the UCD version to show me when it was captured, but you could include a timestamp in the name of the zip file if you wanted.
I next created an application process to call my component process. If you want to run the process in a different way, e.g. from a Generic Process or directly from a Component process, then you may have to use different attributes to form the bundle name.
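In my process this is done with process steps in the UCD designer, but if you prefer scripting, much the same thing can be sketched with the `udclient` command-line client. The Python below only builds the two command lines and does not run them. It assumes `udclient` is on the path and uses its `createVersion` and `addVersionFiles` commands; the function name `upload_commands`, the server URL and all of the values passed in are placeholders, and authentication is deliberately left out, so check the options against the CLI reference for your UCD release.

```python
def upload_commands(component, version, base_dir, weburl, udclient="udclient"):
    """Build the udclient calls that create a version and upload files to it.

    Hypothetical helper for illustration only: it returns two argument
    lists and runs nothing. Credentials are not included here.
    """
    common = [udclient, "-weburl", weburl]
    create = common + [
        "createVersion",
        "-component", component,
        "-name", version,
    ]
    add_files = common + [
        "addVersionFiles",
        "-component", component,
        "-version", version,
        "-base", str(base_dir),  # directory whose contents are uploaded
    ]
    return [create, add_files]
```

Each list could then be handed to `subprocess.run(cmd, check=True)` in turn. Keeping the command building separate from the execution makes it easy to log or review exactly what will be sent before anything touches the server.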
When I look at the versions in my Log Receiver component I see:
I can drill down into the component version I want and download the zipped bundle.
After I’ve retrieved my log bundle, using the Download link on the right, I can delete the component version to reclaim the disk space. If you forget to delete it, UCD will remove it in its daily clean-up cycle. Since these versions are never deployed, there is nothing to hold them, and they should be gone within two cycles.
So, a simple solution to avoid having to let people log in to servers to retrieve logs. We don’t need a ‘man-in-the-middle’ to pick up diagnostic data on behalf of developers or others, with the delays that usually entails. It can be self-service.
Alan Murphy is an IBM services consultant who has worked with clients to help them adopt new tools and processes for the last 20 years. UrbanCode Deploy and DevOps have been his focus for the last 5+ years. He also develops tools to assist clients in tool adoption and blogs on an occasional basis.