Thursday, July 18, 2013

Infrastructure: The need for frequent cleanup of old builds from servers

This is a constant problem for organizations with large products. If your product is around 500 MB or more in size and you are in the development phase, the chances are quite high that you will be generating a build every day. Each build contains new fixes and code changes, so getting a new build daily ensures a quick turnaround in terms of defect closure, and ensures that new features reach the hands of testers almost as soon as the developers are done with them.
However, there is an infrastructural issue involved with generating so many builds. I am talking of cases where the typical release cycle of such a product is more than a few months. During this period, the team will generate a number of builds that need to be hosted on servers so that they are accessible to team members (and they may need to be transferred to additional servers if the team has members in different geographical locations, or if there are vendors who can only be given access to a server outside of the main server that is accessible only to employees). There can also be additional build variants: for example, the same application may have a different structure for the DVD release versus the release on the company's online store, and distribution to vendors might require yet another version. In some cases, the different language versions of the product might need to be separate builds, which further increases the storage requirement.
All of this places a lot of constraints on infrastructure. Central servers are typically set up in a RAID configuration, which means the space needed is actually much more (a RAID 1 mirror, for example, doubles the raw disk space consumed by every build stored). Hard disk capacity is cheap, but you can be pretty sure that at some point, unless you do some optimization, the additional capacity required on a regular basis will start becoming costly: not only in terms of equipment, but also in terms of the staff needed to maintain this capacity, and in terms of making it harder to find what you want. It always makes sense to optimize the storage needs of the product's builds, since if a build is not being used, storing it is unnecessary.
An initial thought might be that only recent builds need to be stored, but that is an oversimplification. There might be defects that were filed some time back, and for the purpose of evaluating them, the builds on which those defects were found need to remain accessible. Further, during coding, errors can be inserted into the code but go undetected for some time (weeks or even months). Even though the change could be identified by doing a diff in the source code repository, it may be necessary to test the build in which the code change first appeared to see what the change caused. There can be numerous such reasons why a specific build is needed at some point in the future, and hence there needs to be a defined process that lets the team control which builds are deleted from the server, leading to an optimization of server space. Here are some points that could help with this:
- If there are builds from an earlier release cycle, then it is probable that those builds are no longer necessary. It might be enough to retain only the builds that were of significant interest in terms of milestones.
- If a build had a problem, either not launching or being rejected for the purpose of testing, it does not need to be retained and can be deleted.
- When builds are older than a few months, the team can decide on a policy to check whether such builds can be deleted or not (a minimal sketch of such a cleanup script follows this list), and so on.
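To make the last point concrete, here is a minimal sketch in Python of an automated age-based cleanup, not anything prescribed in this post. Everything in it is an assumption for illustration: the /builds/myproduct path, the 90-day retention window, and the use of a .keep marker file to exempt milestone builds or builds still referenced by open defects. An actual script would encode whatever policy the team agrees on.

#!/usr/bin/env python3
# Hypothetical sketch of a build-retention cleanup script.
# Assumptions (illustrative only): builds live in directories under
# BUILD_ROOT, a ".keep" marker file flags builds that must be retained
# (milestones, builds referenced by open defects), and any unmarked build
# older than RETENTION_DAYS is eligible for deletion.

import os
import shutil
import time

BUILD_ROOT = "/builds/myproduct"   # hypothetical path
RETENTION_DAYS = 90                # hypothetical policy: roughly three months
KEEP_MARKER = ".keep"              # presence of this file exempts a build

def cleanup_old_builds(root, retention_days, dry_run=True):
    cutoff = time.time() - retention_days * 86400
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if not os.path.isdir(path):
            continue
        # Never delete builds explicitly marked for retention.
        if os.path.exists(os.path.join(path, KEEP_MARKER)):
            continue
        # Delete (or report) builds whose last modification is past the cutoff.
        if os.path.getmtime(path) < cutoff:
            print(("Would delete: " if dry_run else "Deleting: ") + path)
            if not dry_run:
                shutil.rmtree(path)

if __name__ == "__main__":
    # Dry-run first so the team can review the candidate list
    # before any build is actually removed.
    cleanup_old_builds(BUILD_ROOT, RETENTION_DAYS, dry_run=True)

Running in dry-run mode first and reviewing the output fits the idea above of the team, rather than the script, having the final say on which builds are deleted.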

