


Thursday, March 17, 2016

Supporting previous versions of the software (Part 2)

Well, actually, the topic of this post should really be: when to drop support for previous versions of the software. In the previous post (Supporting previous versions of the software - Part 1), we did a top-level summary of the problem and of some of the approaches one could follow.
In this post, we will talk about using data analytics to determine the number and percentage of users who are on previous versions of the software, and how to use that data to decide whether to continue support or not. For example, if the data shows that only 1% of users are on a version that was released about 5 years back, that would really help in the decision making. One of course needs to keep in mind that the data cannot be the only factor in deciding to drop support, but it is far better to have such data than to try to make the decision without it.
How does one get this kind of data? Well, it is fairly easy to have the application phone home whenever the user launches it. This data collection can be incorporated in various ways - it can fire the first time the user launches the application, it can track how many times each feature was launched, for how much time, what the workflows involved were, and so on. The data coming into the application's data gathering mechanism can be tweaked to support whatever kind of analysis is required.
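As a rough illustration of what such a phone-home call might look like, here is a minimal sketch in Python; the endpoint URL, field names, and consent flag are all hypothetical, not any specific product's API:

```python
import json
import platform
import urllib.request

ANALYTICS_URL = "https://example.com/analytics/ping"  # hypothetical endpoint

def send_launch_ping(app_version: str, user_opted_in: bool) -> None:
    """Send a small, anonymous payload when the application is launched."""
    if not user_opted_in:
        return  # respect the user's privacy choice; send nothing

    payload = {
        "event": "app_launch",
        "app_version": app_version,
        "os": platform.system(),        # e.g. "Windows", "Darwin"
        "os_version": platform.release(),
    }
    req = urllib.request.Request(
        ANALYTICS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req, timeout=5)
    except OSError:
        pass  # analytics must never break the application itself
```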
This data provides a great amount of input into the decision making process. For an application that has around 10,000 active users across versions, if there are only, say, 100 users working on a version that was released 5 years back (and during the year of its release, there were around 900 users on it), it becomes possible to decide that support for that version could be dropped. In many cases, the product team could also try to entice the users on previous versions by offering them a discounted upgrade path to the newest version.
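Once pings like the one sketched above are aggregated, the decision rule itself can be very simple. A sketch, assuming a threshold the team has agreed on as policy (the counts below are made up to match the example):

```python
def versions_to_consider_dropping(usage_by_version, threshold=0.01):
    """Return versions whose share of active users is at or below the threshold."""
    total = sum(usage_by_version.values())
    return [v for v, count in usage_by_version.items()
            if count / total <= threshold]

# Made-up counts from the aggregated launch pings
usage = {"v1.0": 100, "v2.0": 900, "v3.0": 4000, "v4.0": 5000}
print(versions_to_consider_dropping(usage))  # ['v1.0'] - 1% of 10,000 users
```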
However, using data analytics comes with its own challenges. There are many cases where there are problems in data collection, or in data analysis, and one needs to be very sure that there are no errors in this process. And there are legal issues that need to be settled. Sending data from the application must not violate the privacy of the user (there can be heavy penalties if it is found that user privacy has been violated). Hence, this kind of data collection and analysis needs to be cleared by somebody in the organization who has the authority to clear such potential privacy issues (this can be the legal advisor in the company).
Read the next post in this series here.


Monday, February 23, 2015

Tracking platform usage and making decisions based on that

Even in today's data-driven world, if you are an analytics expert, you can't expect to be totally popular, or that people will welcome you with arms outstretched. There are many aspects of a product development cycle that would benefit from integration with a data analytics cycle - generating the questions, collecting the data, and the extremely important task of generating the output information and conclusions (and a wrong conclusion at this point can have negative implications for the product - effort put into wrong or unimportant areas). However, consider the case where there is support for using analytics, and there are resources dedicated to ensuring that the requisite changes to the product can be made based on the analytics output.
One major area where the team needs to gather information is platform support. Platform could mean the Operating System (Mac OS, Windows, others), versions (Windows XP, Windows 7/8, etc.) as well as browser support (versions of Chrome, Internet Explorer, Firefox, Opera, etc.). These can make a lot of difference in terms of the amount of effort required for product development. For example, the versions of system files differ across versions of Windows, and it is a pain to support all of them. There can be small changes to the functionality of the system based on these files, and in many cases the product installer will actually detect the Windows version and install the required versions of these system files. If you could find out how many people are using these different versions, and discover that one of them is being used by only a small number of consumers, then the decision to drop that particular operating system version may be taken.
So, this is one of the areas in which analytics can work. The simplest way is to make it part of the functionality of the product: the product, on the user's computer, dials home and provides information such as the platform on which it has been installed. Once this kind of data has been collected and properly analyzed, the product team can look at the information and factor it into the feature list for the next version. The team will need to decide the time period for which the data is captured, and also some benchmarks for deciding whether the analytics data should drive the decision (for example, when the data turns out to be widely different from public perception, that itself needs investigation).
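A sketch of what the analysis side might look like, assuming the dial-home records have landed in some store and each record carries an OS field (the records below are invented for illustration):

```python
from collections import Counter

# Invented dial-home records; in practice these come from the database
records = [
    {"user_id": 1, "os": "Windows 7"},
    {"user_id": 2, "os": "Windows 8"},
    {"user_id": 3, "os": "Windows XP"},
    {"user_id": 4, "os": "Windows 7"},
]

counts = Counter(r["os"] for r in records)
total = sum(counts.values())
for os_name, count in counts.most_common():
    print(f"{os_name}: {count} users ({100 * count / total:.1f}%)")
```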
However, this also depends on how much the team can trust the data and the analysis; after all, even a small error during analysis can produce information with real inaccuracies in it. But it is necessary for the team to spend effort on analytics, since the payoff from accurate data analysis and interpretation is very high.


Thursday, July 25, 2013

Defect Management: Dividing overall defect reports into separate functional areas - Part 3

This is a series of posts where I look at the creation of a defect status report, in a way that provides enhanced information to the team and its managers and helps them make decisions. In the previous post (Count of new incoming defects), I talked about adding parameters to the defect report that let the team know whether the number of defects being added on a daily basis will still allow them to reach their defect milestones by the required date. This kind of data, coming in on a regular daily cycle, helps the team decide whether their current defect trend is on the desired path, above it, or below it, and make the required decisions accordingly.
This post adds more detail to the defect report that provides a lot of useful information to the team. The last post talked about the flow of incoming defects that move to the ToFix state. However, there is another type of information that is relevant. Defects present in the system are not only owned by the development team. In addition to defects in core code, there may be defects in the components used by the application. These are defects that are not attributable to the development team, but to the vendors or other groups that provide the components. A number of teams that I know track these defects in the defect database, but distinct from the defects with the core team.
The nature of defects filed against external components is different from those in the core code. Even though it does not matter to the customer whether a defect is in the core code or in an external component, the amount of coordination and communication required is entirely different. If a defect is in a component not owned by the team, the timeline for fixing it may be longer and need persuasion; there may be a lot of back and forth between the tester and the external component team to study the situation in which the defect occurs (which can include sending the environment in which the defect occurred to the external vendor - and this has its own cost and restrictions, since if the team is working on a new version of the software, there would be NDA and IP issues involved in sending details of the environment to the external component team); and so on. Another concern is that even if such a defect is resolved, it might require a new version of the component, which carries its own extra cost of testing the component on its own to check whether it is fine or has other issues.
As a result, the incoming defects need to be separated out by whether they belong to the core team or are attributable to people outside the team; and if the proportion of defects outside the core team is increasing, it is a matter of concern, since resolving such defects typically takes much more effort and time.
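A small sketch of how that proportion might be tracked, assuming each defect record carries an owner field distinguishing the core team from external vendors (the field name and sample data are made up):

```python
def external_defect_share(defects):
    """Fraction of open defects owned by external component vendors."""
    external = sum(1 for d in defects if d["owner"] == "external")
    return external / len(defects) if defects else 0.0

# Made-up daily snapshots; a rising share is the warning sign discussed above
monday = [{"owner": "core"}] * 45 + [{"owner": "external"}] * 5
friday = [{"owner": "core"}] * 40 + [{"owner": "external"}] * 12
print(f"Mon: {external_defect_share(monday):.0%}, "
      f"Fri: {external_defect_share(friday):.0%}")  # Mon: 10%, Fri: 23%
```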


Wednesday, July 24, 2013

Defect Management: Dividing overall defect reports into separate functional areas - Part 2

Part 1 (Dividing defect reports into ToFix, ToTest and ToDefer counts) of this post talked about the importance of defect management in a software project, and then got into some details about regularly sending out a report on total defects, broken down into ToFix, ToTest and ToDefer counts and maintained on a graph over a period of time with daily updates, so that the team and the managers can figure out whether the team is on track to resolve these bugs.
This post continues along this line, talking about additional items that can be added to this defect chart and metrics to provide more information to the team and determine whether it is on the right track or not. Are all these metrics important? There are a lot of comments about not over-burdening people with too many statistics, and more comments about letting people do their work rather than sending so many statistics that they stop looking at them. However, it is also true that the role of the team managers is to look at the broader situation in terms of project status, and defect management is an important part of this. Team members should not be burdened with these figures, but for the team managers, it is critical to look at such data.
So, the team looks at the ongoing ToFix figures over a period of days and tries to determine whether it is on the right track. What else should you be capturing? Another metric that can now be added to such a report is the number of defects that are still incoming. There are primarily 2 ways in which defects can be added to the developers' count:
- New defects that are logged against the development team and which add to their count and to the overall ToFix count
- Defects that have been rejected by the testing team after they were marked fixed by the developer because there is a problem in the fix (this can vary a lot among different teams and even within a team - one developer could be fixing defects with hardly any returns, while another, under pressure, could have many defects returned because of problems with the fixes). Tracking this statistic tells the team whether it is seeing such returns in its defect management.
Once you have these kinds of defect counts, it helps in determining the current status of defects and whether the team is on the right track. So, you have a total count of open ToFix defects, and that count needs to decline at a certain rate to hit the deadlines. But to actually get to the deadline, the number of incoming defects also has to fit into this strategy. If there are a large number of incoming defects, the ToFix count will not decrease by the amount needed to hit the targets, and the strategy then needs to change if the team is to get there.
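As a rough sketch of that arithmetic (all the rates below are invented), here is a projection of whether the ToFix count reaches zero by the deadline once incoming defects are netted against the fix rate:

```python
def days_to_clear(open_tofix, fixes_per_day, incoming_per_day):
    """Days until ToFix hits zero, or None if the count never declines."""
    net_burn = fixes_per_day - incoming_per_day
    if net_burn <= 0:
        return None  # incoming defects swamp the fix rate
    return open_tofix / net_burn

# Invented numbers: 240 open defects, 12 fixed/day, 7 new or reopened/day
days = days_to_clear(240, 12, 7)
print(f"Projected clear in {days:.0f} working days")  # 48 days
# If the milestone is 30 days away, the strategy has to change.
```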


Tuesday, July 23, 2013

Defect Management: Dividing overall defect reports into separate functional areas - Part 1

Handling defects is one of the major efforts that plays an integral role in keeping a project schedule on track and making it successful. I have known multiple teams that did not have a good running estimate of their defect count and of the defect trend over the remaining period of the schedule; as a result, when the team was close to the final stages, they found they had too many defects, which made the remaining part of the schedule very tight. If they were to do an accurate reckoning of their status, they would need to either defer more defects and possibly end up with a product of lower quality, or extend their timeline / schedule, which has huge implications for the team and the many other teams involved in the release schedule of the product.
How do you avoid this? The first paragraph of this post points out a huge problem, but the answer cannot be handled in a single post; it can be summed up by a single cheesy phrase that provides no solutions by itself - "You need to do Defect Management". Now, let us get down to the meat of this post, which takes up one specific aspect of defect management - sending out a split of the defect counts across the different areas. This provides a better view of the defect picture to the team and helps the process of overall defect management.
We wanted a system whereby we could track the counts for each of the separate functional areas and yet give the management team access to these data on an ongoing basis. This also helped the separate functional teams target the defect counts of their respective areas and work towards reducing them. So, we took the overall data for open defects for the product and split it into the following areas:
Open defects:
- ToFix (these are primarily defects owned by the development team, although there can be defects carried by other teams - such as defects in components supplied by external teams)
- ToTest (these are primarily defects owned by the testing team, although since anybody in the team can file a defect, people other than the testing team may own a defect)
- ToDefer (the exact terminology differs across organizations, but these are typically defects that are with a defect review committee for evaluation. These can be significant defects that need evaluation by the committee before they are fixed, or defects not worth fixing where the team wants the committee to take the final call, and so on)
What is the purpose of sending out separate stats on a regular basis? These data, when plotted on a graph over a period of time, provide a large amount of information. The team and the managers, although they are in the thick of things, sometimes need to see this kind of aggregate information to take a good decision. For example, if the team is in the second part of the cycle and close to the timeline, yet the ToFix graph does not show a declining trend, that is something to worry about. When this happens, I have seen the development team manager hold a major discussion with the team to figure out what is happening and how to reduce these counts. In extreme cases, I have seen the team take a hard look at these defect counts and then recommend extending the schedule (which is not a simple step to take).
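A minimal sketch of producing such a daily split, assuming the defect tracker can be queried for each defect's current state (the states mirror the list above; everything else is invented):

```python
from collections import Counter
from datetime import date

# Invented query result; in practice this comes from the defect database
defects = ["ToFix"] * 120 + ["ToTest"] * 45 + ["ToDefer"] * 15

def daily_defect_report(states):
    counts = Counter(states)
    line = ", ".join(f"{s}: {counts[s]}" for s in ("ToFix", "ToTest", "ToDefer"))
    return f"{date.today()} - Total: {len(states)} ({line})"

print(daily_defect_report(defects))
# Appending this line to a log each day gives the trend graph its data points.
```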


Tuesday, June 11, 2013

Analytics - Measuring data relating to user information - Part 7

This series of posts is about analytics, the measurement of information related to a user. The last post in this series (Measuring data relating to user information - Part 6) was about an error in data collection, as well as the pitfalls of a strategy where business decisions are made solely on the basis of data collected through analytics. Analytics should be one of the pillars of a decision making strategy, with market research and other factors also contributing to how and why a decision should be made. If you jump into making decisions without a proper decision making strategy, there is a good chance the decision making will go down a path that is faulty or inaccurate.
In some of the previous posts, I have taken examples showing where decisions about the product can be made on the basis of data collected from the product in the hands of customers. In this post, I will take another example of the same - a case where there were certain reports from customers, and product management was not sure whether the feedback was accurate, and was looking for more evidence to substantiate or refute the problems. In this example, there was feedback from several channels suggesting customers perceived the quality level of the product as not as good as previous versions, manifested in more cases of crashes.
Now, this is good feedback, but can you take it as gospel truth? On the surface, it seems like clear feedback that you should recognize and act on. Acting on it would mean behaving differently from how you behaved after previous releases: committing more effort to investigating and solving quality problems. Even though this might seem to be in the customer interest, it is a cost to ongoing product development. And there is the contrary view - with more means of expressing discontent, such as user forums and community groups, there is the possibility that the quality level is the same as in previous versions, and the information collection systems are simply catching more of the discontent.
Now you are stuck. Both explanations sound plausible, but you need to take a decision one way or the other. This is where data collection from user systems works well. One of the first items you should be capturing is whether the user's application closed normally or abnormally (such as in a crash, or when the user was forced to terminate the application after it hung), and you should capture this information across the operating systems supported by the application. Further, you would need to do this for the different features and dialogs present in the application. Once you are capturing this information, there is a lot you can do to determine how often a feature or the entire application crashes in the hands of users, and where the crash happens (though determining what causes the crash will need a lot of development effort). This will also help determine whether crashes are more frequent than in previous versions of the application.
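One common way to record "closed normally vs. crashed" is a session marker: write a flag at startup, clear it on clean shutdown, and report a crash if the flag is still present on the next launch. A sketch under those assumptions (the file path and report format are hypothetical):

```python
import os
import platform

SESSION_MARKER = os.path.expanduser("~/.myapp_session")  # hypothetical path

def on_app_start(report):
    """If the previous session's marker was never cleared, it crashed."""
    if os.path.exists(SESSION_MARKER):
        report({"event": "abnormal_exit", "os": platform.platform()})
    open(SESSION_MARKER, "w").close()  # mark this session as in progress

def on_clean_shutdown(report):
    report({"event": "normal_exit", "os": platform.platform()})
    os.remove(SESSION_MARKER)

# Usage: call on_app_start(print) at launch, on_clean_shutdown(print) at exit
```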
Once you have such data and it has checked out as accurate in some respects (for example, if your testing team is seeing crashes in reasonably similar areas, that helps confirm the data to a large extent), you can make product level decisions. If that means you need to spend time on product stability and quality, then you do so; otherwise, if the quality level seems fine, you know that the information coming from customers should be handled through regular support mechanisms and does not need the development team to spend extra effort.


Tuesday, May 28, 2013

Analytics - Measuring data relating to user information - Part 6

This is a series of posts about the use of analytics in your software application. My experience is more in the area of desktop applications, but a lot of what has been written in the earlier posts also applies to analytics for web applications; there may be some differences, but the need for analytics is the same, and so are the decisions that can be made on its basis. Some of the guidelines and warnings are also the same; in short, whether you are working on desktop or web applications, the tools may differ, but there is a strong need to design a strategy for analytics rather than doing it on an ad hoc basis. In the previous post (Analytics - Measuring data relating to user information - Part 5), I talked about a problem where the team had made a strategy to collect data, but there were not enough people to actually analyze the data and take decisions based on it.
However, there are some pitfalls when it comes to analytics and taking decisions based on it. There is a joke about a person who would scream for data for every decision: whenever any decision needed to be made or planned, there would be a hunt for data, and if the data was not present, there was a high chance the team would be sent off to gather it. It is a joke, but I have seen managers who get too data-oriented. This may be anathema to firm proponents of analytics, but there can be 2 problems with an analytics oriented approach:
- The data may be incorrect
- There may be so much emphasis on data, that it crosses a limit and common sense is lost

Sometimes these 2 problems can intersect, but let's take each of them separately. You cannot just wish for data to be correct - an obvious statement, but it goes to the heart of the problem. We had a situation where we were collecting data sent by a particular dialog in the application, and the data was coming in beautifully. A full release went by, and nobody thought much about the code in that particular function. However, in the next release, there was a defect in that section of the dialog that also affected the data being collected, and the developer debugging that area came across something puzzling. It turned out that the data collection code did not cover one of the paths the application could take - a path we speculated was hit around 15% of the time in customer interactions, but we had no real data. The net result was that we understood our data for that particular dialog was understated by some percentage, but we did not know that percentage accurately. Hence, any decision we made from that dialog's data had an unacceptable margin of error. We reviewed the test cases and their execution from the time the code was written, and realized that because of a paucity of time, the testing for this particular part was not done as it should have been. The learning from all this was that data can be incorrect even with the best of efforts. And this takes us to the next paragraph, though not directly.
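To make that failure mode concrete, here is a contrived sketch of the kind of gap we hit: the logging call sits on only one path through the dialog handler, so the other path silently goes uncounted (the function names are invented):

```python
def log_event(name):
    print(f"analytics: {name}")  # stand-in for the real phone-home call

def on_dialog_confirm(user_pressed_ok):
    if user_pressed_ok:
        log_event("dialog_confirmed")  # instrumented path
        return "saved"
    else:
        # BUG: no log_event here, so cancellations are never counted,
        # and every rate computed downstream from this data is skewed
        return "cancelled"
```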
Basing business decisions on data analysis can be great if you have the correct data, and can be suicidal if your data is incorrect. Further, when important decisions are being taken, there should be some sort of confirmation - the data should be used to confirm a decision rather than being the sole driver of the decision making. So, suppose the business end of the application wants to run a campaign based on the market information they are seeing; analytics can be of great help in confirming some of the assumptions the team is making as part of that decision. But using analytics alone as the base on which to make decisions, or creating an environment where that happens, is not recommended.
Even when collecting data, there should be a thorough analysis of the data and the data collection methods to ensure that what is collected is correct. In fact, I had a colleague who was in favor of analytics but had been burnt before. His advice was simple - when you are getting data from analytics, assume that the data is wrong, prove that it is right, and only then use it.

Read the next post in this series - Measuring data relating to user information - Part 7


Thursday, May 23, 2013

Analytics - Measuring data relating to user information - Part 5

This is a series of posts relating to the measurement of user information and analyzing the data related to it. In a previous series of posts, we looked at some examples of data measurement and how to use the analysis for decision making. In the previous post (Measuring data relating to user information - Part 4), I looked at the other side of gathering data. One can go overboard and start collecting information that gets the product and the organization into legal trouble and facing protests from consumers. One always needs to ensure that the data being collected has been cleared by the legal team, or by others authorized to ensure that the data collection does not go beyond what is legally permitted.
This post covers a common experience for people in the business of collecting user information and then trying to make some sense of it. Even large companies face this problem - the problem of not doing anything with the data. It seems very strange, but it happens a lot. Recently I had a discussion with an analytics consultant who has been in the field for a decade and a half, working for organizations and working independently. The biggest problem he faced was getting people to commit either enough resources to do the data collection or, even more, to analyze the data and make informed decisions.
I have seen this problem myself. The software applications that we worked on had a large amount of instrumentation for collecting data on many different parameters. This was done after careful design by a team comprising the product manager, developers and testers, including a design for what to do with the data once the product had been released and the data collected. However, it should be well understood, as part of the design for analytics, that there needs to be effort to analyse the data and make sense of it. And this is where the biggest problem was. Resourcing was always a constraint, since analytics was never a priority over features for the product, and unless there is a change in this attitude, things will never change.
So, what used to happen? All the data would be collected into a huge database of information for each version (which would grow over time as we made additions to the data collection for the features in the application), but at most we would take one set of data and try to make sense of that.
The team could see this too, and hence, the next time a discussion was called to figure out what more data should be collected and which reports would be useful, there was less interest from the team in being involved in the analytics for the ongoing version. This was getting to be a big problem, but there was no clear solution. The pressure of resourcing vs. feature work was not going to go away, and unless analytics was put on the same priority as feature work, things would not really change. And this is where teams, and especially their managers, need to be more immersed in what can be done through analytics, primarily the advantages for business decision making.

Read more about this in the next post. (Analytics - Measuring data related to user information - Part 6)


Analytics - Measuring data relating to user information - Part 4

This is a series of posts that focus on analytics and its importance in today's world. In the previous post (Measuring data relating to user information - Part 3), I talked about using data from previous versions of the software to determine trends, using those trends to make major business decisions, and using this data to complement trends seen in other sources of information.
In today's world, analytics can play a major role. I still remember an article (read the article) which described the power of analysis in today's world, and how it can surprise even people outside the industry. The same article also highlighted the power of analytics and the privacy problems it poses. If you read the article, you would be amazed by what data analysis can reveal, and also shocked to some degree by what the data reveals and whether you are comfortable with that kind of information about you being deduced.
And this is the main content of this post. If you are starting to collect data about your customers from within your software, you need to be sure that you are not overdoing it. Once you start designing what data you are going to collect, you need to ensure that there are hard lines setting the boundaries for the data you collect. If you are using a component, you need to ensure that the component respects the same data privacy constraints that you do.
This might seem excessive, but keep in mind that privacy is a big deal. Most software development teams are not equipped to determine what is proper or not. This was brought home painfully when we let the development team design the analytics capturing process, including all the information that was supposed to be captured; when we then met the legal team, they junked more than 25% of the data that we wanted to capture. We did not like it, but certain boundaries are required. The opposite case does not bear thinking about: you could go beyond the privacy guidelines, implement them in your product and release it, and then somebody detects that you are capturing information that is deemed personal, and you are stuck. So, it is always recommended that once you get into capturing information for analytics, you make sure it has been confirmed by somebody who is a privacy expert - this could be a legal person, or somebody else.
There are even more worries, especially when your product sells across geographies. A country or region may have different privacy standards when it comes to collecting information from the user's machine (for example, the European Union has far tighter privacy guidelines), and you need to ensure that you are not falling foul of one region by following the standards of another. For example, a region's privacy guideline might insist that users know about the information that is to be collected, and have given their permission for the same.
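One pattern that follows from these constraints is an explicit allowlist of legally approved fields plus a consent check, so that nothing leaves the machine unless both the legal review and the user have said yes. A minimal sketch (the field names and consent mechanism are assumptions for illustration):

```python
# Fields cleared by the legal / privacy review - nothing else may be sent
APPROVED_FIELDS = {"app_version", "os", "os_version", "feature_name"}

def build_payload(raw_data, user_consented):
    """Return only approved fields, and only if the user has opted in."""
    if not user_consented:
        return None
    return {k: v for k, v in raw_data.items() if k in APPROVED_FIELDS}

raw = {"app_version": "4.2", "os": "Windows", "user_name": "alice"}
print(build_payload(raw, user_consented=True))
# {'app_version': '4.2', 'os': 'Windows'} - user_name is silently dropped
```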
But this is not to say that you should be scared of capturing user information. If you have a verified system of guidelines and are following it, you should not be worried. Make sure you are doing your best to capture the relevant information, and you can learn far more about your users' preferences than you expected, which will help you make the best decisions about your product.

Read more in the next post (Measuring data relating to user information - Part 5)


Wednesday, May 22, 2013

Analytics - Measuring data relating to user information - Part 3

This is a series dealing with analytics, and the advantages it brings to the product team. However, any movement into the area of analytics requires a lot of careful thought, and needs time to be spent designing a strategy around analytics. You cannot just say that you want analytics and throw some strategy in place. But when you do get your analytics strategy right, there are a lot of possible benefits. In the previous post (Analytics - measuring data related to user information - Part 2), I talked about a scenario where a team wants to find out the video formats its users are using in the application, and the benefits of making decisions based on this knowledge rather than on guesses (which may be right, or could be wrong). In this post, I will write more about the use of analytics in making decisions.
Consider the previous post, which talked about which video formats are popular. Now suppose that you need to make decisions about the future, which means that you need to do much more analysis of the data you are getting. So, if you have been in the game for many versions now, don't just look at data for the previous release. If you have been gathering such data across releases, make the effort to review it across those releases to figure out the best path ahead. So, even though in the last post we only reviewed the proportion of video formats in use, a better analysis would look at the proportion of video formats that users have been using over the past few versions.
Over a period of time, such analysis would, in most cases, reveal trends that are useful for the designers of the product, as well as the product managers, to know. Until you have done such an analysis, the way to learn this is by looking at industry data and research done in the form of surveys and information from users through other means - but all of this is indirect data. Analysis of analytics data allows you to confirm such information, and can help you make decisions that are backed by hard data.
A possible example that shows the value of such analysis is the usage of mobile devices. Say the product manager has seen a broader worldwide trend towards using mobile devices for capturing video, and then looks at an analysis of the data from the past several versions covering the source of the videos used by the application's consumers. Consider a case where the trend shows a movement towards mobile devices as the source, but the trend is slow, going up only from around 12% to 16% over the past 2 years. The question in front of the product manager is whether to divert resources to producing a mobile version of the application; that would require a large amount of code change and architectural and design work, and hence would have an impact on the current release. The other option would be to plan for a mobile version only in the next release, which would have a lower impact on the current release. Based on this data, the Product Manager might decide that although a mobile version is attractive, the data does not suggest an urgent need to create one in the current release, especially given the costs of doing so. Instead, one can wait for the next release.
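A sketch of the kind of trend check behind that call, using the made-up 12% to 16% figures from above (a simple per-version share and its growth rate; no statistics library needed):

```python
# Made-up per-version share of videos whose source was a mobile device
mobile_share = {"v5": 0.12, "v6": 0.13, "v7": 0.15, "v8": 0.16}

versions = sorted(mobile_share)
growth_per_version = (mobile_share[versions[-1]] - mobile_share[versions[0]]) \
                     / (len(versions) - 1)
print(f"Growth: {growth_per_version:.1%} per version")  # ~1.3% per version

# A slow, steady climb like this supports deferring the mobile version;
# a sudden jump between the last two versions would argue the other way.
```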
Taking such a decision is critical for the application, and taking it without data on consumer usage would make it more of a guesstimate - some information suggests you can take a decision, but the amount of data you have is not adequate to give a high level of comfort in taking it.

We will continue this series on the usage of Analytics in the next post.


Monday, May 20, 2013

Analytics - Measuring data relating to user information - Part 2

In the first part of this series (measuring data related to users - Part 1), I started out by outlining what analytics is, what kind of information can be captured from users, what kind of information should not be captured based on privacy guidelines, and what you can do with this kind of information. In this post, I will continue along this line and provide some more examples of what can be done (the purpose of this series is to describe what can be done with analytics through some real life examples).
Let us take the example that we use a lot: a greeting card application that allows users to use their own photos or images in addition to standard greeting card background photos, allows them to use their own audio and videos or record the same from the camera and microphone on their computers, and also allows them to add their own greeting text. The final assembled greeting can be sent via email, or through social networks.
Now, the application designers are trying to figure out a tweak related to the videos that users upload from their own machines. There are numerous video formats that users can have, since there are many different capture devices. You could shoot a small video clip using a mobile phone (which can itself use a different video format depending on the manufacturer), using a tablet, using a still camera, or using a video camera.
The size of the video depends on the shooting device, on whether the user has reduced the resolution of the video in order to reduce the size, or, in some cases, on a decision to reuse the same video that was uploaded to Youtube (which means the video would have been converted to the FLV video format). For most of these formats, the videos cannot be used just like that. Codecs (coders / decoders) need to be used for this purpose, and although there are some open source solutions, there is also commercial software that could be used.
The decision about whether to use open source or commercial software could depend on the number of users who would use it. So, as part of the data gathering in previous versions of the software, it could be determined which video formats users are using, and from this data the proportion of each video format can be computed. If it turns out that the number of users using a particular format is above a certain proportion, the product team might determine that it makes sense to use a commercial video encoder/decoder rather than the open source one: the advantage users get from the better software component would be greater than the cost advantage of the open source software. But unless you are getting such information through analytics, any decision you take would be flawed - based on a hunch rather than information.
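A sketch of that decision rule, assuming the format of each uploaded video has been reported through the analytics channel (the 20% threshold and the sample data are invented):

```python
from collections import Counter

# Invented telemetry: one entry per video a user imported
formats = ["mp4"] * 55 + ["avi"] * 25 + ["flv"] * 15 + ["mov"] * 5

COMMERCIAL_THRESHOLD = 0.20  # invented policy: license a codec above 20% share

counts = Counter(formats)
total = sum(counts.values())
for fmt, n in counts.most_common():
    share = n / total
    plan = "commercial codec" if share >= COMMERCIAL_THRESHOLD else "open source"
    print(f"{fmt}: {share:.0%} -> {plan}")
```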

Read the next post in this series (Measuring data related to user information - Part 3)


Analytics - Measuring data relating to user information - Part 1

For any desktop product, capturing data relating to the computer systems of its user base is very important. For those who are not conversant with the idea of analytics or the necessity of capturing such information, it would make sense to ensure that analytics forms a part of their overall product strategy; but before that, it is necessary for them to understand what analytics is and why it should play a role in the overall strategy of a product team.
As always, I will try to use layman's terms in this post rather than technical jargon. Analytics is very simple - it means capturing information about your consumers (for example, you might capture the number of times the customer has launched the application, information about the processors of the customer's machine, the Operating System version, whether they are using Windows or Mac, and so on). Of course, there are privacy laws in place, and you need to ensure that you are not capturing information that allows your customers to be identified (for example, if you are capturing information about the folder in which the user is storing data, that folder path may contain the user name; further, most advocates of privacy laws would be very hesitant about capturing the serial number of the user).
Capturing such information is possible through code and functionality that captures events during the user's interaction with the software application (such as the number of times the user has launched a dialog window, how long the user keeps a particular functionality open, the user workflow during the process flow, and so on). One prime use that I have seen is capturing the number of times a dialog shut down improperly (crashed), and the same for the number of times the software application crashed as well.
Once the designers of the application decide on the information that needs to be captured, code has to be written to pass that information over the internet to a tracking mechanism on the software maker's website. Now, keep in mind that this information is sent into the proper tables of a database, and further, that it can be very voluminous. It may be a few KB of information per user, but when you start dealing with thousands of users, or even millions of customers (as for large applications such as Microsoft Office and Adobe Reader), the amount of data can be huge, and the database needs to be ready to handle it.
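The back-of-envelope arithmetic is worth doing before the first byte is collected. A sketch with invented numbers:

```python
users = 2_000_000          # invented active-user count
kb_per_user_per_day = 5    # invented payload size
days_per_release = 365

total_gb = users * kb_per_user_per_day * days_per_release / 1024 / 1024
print(f"~{total_gb:,.0f} GB per release cycle")  # ~3,481 GB, i.e. several TB
```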
In addition to the design of the database for capturing the information, the next step is the processing of this captured information. Even though you have captured a large amount of data, effort needs to be put into analysing it. The ideal people for analyzing such data are typically those working on the relevant features. So, when the data analysis for a particular feature needs to happen, it should be done by the same team that is working on the feature, since that team has the expertise to understand what the information means, and what to do with it.

Read more about this in the next post - Measuring data related to user information - Part 2


Thursday, February 28, 2013

How to determine the Operating System support for your product - Part 8

In this series of posts, I have been talking about the Operating System support provided in your applications. In the previous post (Operating System support for your software application - Part 7), I talked about 32 bit vs. 64 bit support, and about how the support provided by components can be a huge factor in the operating systems you support. I took the example of a video application which depends on a large number of external components for codecs, encoders and decoders, for writing to DVD and Blu-Ray, and for other parts of the application. If some of these components drop support for an operating system and a component is a critical part of the application, then it is time to take the decision to drop that Operating System. I know of a number of software applications that finally dropped support for Windows XP because 1) Microsoft was on its way to dropping support for Windows XP, with this support ending in 2014, and 2) a number of external components dropped support for Windows XP, and these were critical enough that the management team of the application finally bit the bullet and dropped support for Windows XP as well.
What happens when you cannot afford to drop support - components have dropped the operating system, but the customer profile is such that there is still revenue to be had from customers on it? Well, that is not a very nice place to be, but you still need to take a stand. If the revenue is important, then you will need to support that specific operating system. So what do you do? There are a number of steps you can take to keep your product on that operating system:
1. Make another effort to ensure that the external component retains support for that specific operating system. If the company or group providing the external component is not willing to provide full support, ask whether they are willing to maintain it at the level previously supported. If this is another group within the company, the revenue potential provides some leverage to escalate and get support maintained, even if at a lower level (only critical bug fixes rather than all bug fixes).
2. The most risky approach: take a chance and ship with a component that is not supported by the provider on that specific operating system. The problem is that if some critical problem emerges, things can go out of control very easily and lead to a situation where there are no good options.
3. Look for alternatives. There are very few functionality items that do not have multiple providers, even if the alternative offers less than perfect functionality. If using another component provides a solution, then evaluate it and use it if it meets your purpose (even if less than perfect).
4. Prepare for reduced functionality. I have seen many products take this approach. When there are no alternatives, and it is decided that support for the specific operating system must continue, it may be as simple as dropping the component which has dropped support for that operating system, and shipping the product without the functionality that component provided. This needs to be communicated to customers as well, so that they know there will be reduced functionality on that specific operating system.


Wednesday, February 27, 2013

How to determine the Operating System support for your product - Part 7

In the previous post in this series (Operating System support for your software application - Part 6), I focused on the different roles and responsibilities of stakeholders in the team, primarily Product Management, QE, and the developers. The Product Manager has to take the decision while weighing the various pros and cons of such a move and evaluating the revenue impact; the QE and development teams contribute to the discussion keeping in mind the impact on their effort, and any technical factors that could also influence the decision. In this post, I will talk more about dependencies, and also briefly touch on the 32 bit vs. 64 bit discussion.
For some years now, there has been an ongoing discussion about the need to move applications onto a 64 bit architecture and stop supporting 32 bit architectures. Most people will not understand this discussion, or the reason why it is a top item for many teams. In near layman's terms, when you state that your application is now 64 bit, it means it can take inherent advantage of the benefits of the new wave of 64 bit operating systems - being able to allocate more memory, and numerous other technical advantages. Also, most Operating Systems now available are 64 bit. So why not go ahead and convert your application to 64 bit? Well, converting your codebase to offer native 64 bit support is a project by itself, requiring a large amount of development and testing time. For teams with limited resources, making such choices is not easy (and most teams cannot claim to have unlimited resources). In addition, you also need to realize that you would no longer be properly supporting consumers who still have 32 bit operating systems (even where the hardware has supported 64 bit for a long time now), so this is a decision that needs to be taken carefully.
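As a small aside, it is easy to check what you are actually running. A sketch in Python (the same distinction - process bitness vs. OS bitness - applies in any language):

```python
import platform
import struct

# Bitness of the running process: 8 bytes per pointer means 64 bit
process_bits = struct.calcsize("P") * 8
print(f"Process: {process_bits}-bit")

# The machine architecture can differ: a 32-bit build happily runs
# on a 64-bit OS, which is exactly the transition discussed above
print(f"Machine: {platform.machine()}")  # e.g. 'AMD64', 'x86_64', 'i386'
```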
The other aspect of this post concerns the various components that your application uses. In today's world of software development, it is hard to think of a large piece of software that the development team has written entirely. Consider that your product is a large video manipulation application. Even though a lot of the workflows will be written by your team, the functionality of a number of sub-areas is better handled by using external components (which could be built by other teams within the company, or by other companies which specialize in such areas). For example, if you are looking at an application that allows users to organize, edit and manipulate videos, you will need support for different video formats, access to different encoders and decoders, and components for creating DVDs or Blu-Ray discs as part of the end process. In all such cases, it is far more efficient and effective to use specialized software rather than trying to replicate all of it.
And this is where your dependencies start to dictate matters in terms of operating system support. The external components that you use are created by companies that in turn have to take the same decisions on operating system support as you do, based on their own large customer bases. It is entirely possible to end up in a scenario where some key component you use drops support for an operating system, and given its criticality in your own application, you are forced to stop supporting the same operating system as well.


Tuesday, February 26, 2013

How to determine the Operating System support for your product - Part 6

This blog has seen a series of posts on deciding the Operating Systems supported by your application. The previous post (Operating Systems supported by your application - Part 5) talked about the kinds of constraints there are for an operating system that is not supported - whether they prevent the user from installing the application on that operating system, or just give a warning and let the user install anyway. This post will talk in more detail about the process by which the various stakeholders in the team come to a decision about the operating systems to support.
The most important stakeholder is the Product Manager. It is the Product Manager who is responsible for the final state of the product, including the system requirements (which include the operating systems supported by the application). The Product Manager is also responsible for the revenue requirements of the product, and supporting or dropping an operating system can change the revenue generated by a few percent (and those few percent can make a huge difference in terms of whether targets are met or missed). Hence, it is for the Product Manager to take the final call on whether the product should drop a specific Operating System. However, it is perfectly fine for team members to provide inputs and constraints to the Product Manager.
Another important stakeholder is the QE team (the testers). During the testing phase, the team needs to draw up a plan covering all the operating systems to be supported, decide the amount of effort to be spent on each, and then actually put in the effort. Suppose a team supports Windows XP, Windows Vista, Windows 7 and Windows 8. In such a case, the team would use data about the approximate number of users on each operating system to prioritize the testing effort on each (no testing team has enough resourcing to run all the tests it would like to). But you would still expect that with 4 operating systems, the team would spend at least around 15% of its effort on each. In some cases, the testing for an operating system might take more time because there are more defects to be found on it (for example, we found a lot more problems on Vista, many of them related to the user access control security mechanism introduced in Vista).
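A sketch of one way to split that effort - proportional to user share, but with a minimum floor per supported OS. The 15% floor comes from the paragraph above; the user shares are invented:

```python
def allocate_test_effort(user_share, floor=0.15):
    """Give every OS at least `floor`, split the rest by user share."""
    remaining = 1.0 - floor * len(user_share)
    total_share = sum(user_share.values())
    return {os_name: floor + remaining * share / total_share
            for os_name, share in user_share.items()}

shares = {"Windows XP": 0.10, "Vista": 0.05, "Windows 7": 0.55, "Windows 8": 0.30}
for os_name, effort in allocate_test_effort(shares).items():
    print(f"{os_name}: {effort:.0%} of testing effort")
```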
So, if there is a possibility of dropping an operating system because of a smaller number of users on it, the testing team would hold a number of discussions with the Product Manager on this topic, to ensure that their voice is heard and that any data they have is also passed on to the Product Manager (such data could be the increasing number of bugs they are finding on older operating systems).
The development team are also important stakeholders. They are responsible for ensuring that the code makes functionality work the same on all supported operating systems, which can be problematic at times because operating systems behave differently. In addition, there may be components in use that are not as well supported on older operating systems.
All these stakeholders and opinions need to be factored in before deciding whether to drop a specific operating system; it is not a decision to take lightly.

In the next post, I will add more points on this particular topic (Operating Systems support for your application - Part 7)


Monday, February 25, 2013

How to determine the Operating System support for your product - Part 5

This particular series of posts talks about how to determine the support for previous Operating Systems provided in your application (Operating System Support in your application - Part 4). In the previous post, I talked about one major factor - when the maker of the Operating System (whether Microsoft or Apple) decides to drop support for it and will not provide any more bug fixes or other support. This creates a problem: even if you decide to support such an operating system, you will not get any bug fixes from its makers, which can be a huge potential problem given the interactions of the application with the Operating System.
In this post, I will talk about the process of cutting off support for an Operating System. There are 2 different methods that I have seen. One is to apply a hard constraint, which means the user will not be able to install on that specific Operating System; the other is a soft constraint, which means the user is given a warning when trying to install on that version of the Operating System.
Consider the variation using a hard constraint that prevents the user from installing on such an Operating System. What this means is that when the user tries to install the application, the installer determines the specific Operating System loaded on the computer and checks it against the supported list. If the Operating System is not supported, the application installer gives the user an error and prevents any installation on the user's machine.
A hard constraint is needed when the makers of the software have determined that the user should be prevented from installing on that Operating System. This can be when there is a great deal of uncertainty about whether the application will work well on that Operating System without defects, or when the makers have decided that the version of the Operating System is not in wide use anymore. Hard constraints are also used when the Operating System installed on the user machines is controlled, such as in the case of higher end or specialized software.
A soft constraint means that the user gets a message during the installation process about the version of the Operating System not being supported, and gets an option to proceed or not. If the user decides to go ahead, the application gets installed. This is normally done when very few problems are expected on that specific Operating System, and the company does not want to force the users of that Operating System to look for alternative solutions. There will still need to be some testing on that Operating System, but not at the same level as for the supported Operating Systems.
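A sketch of how an installer might implement the two constraint types (the version lists and messages are invented; a real installer would read the OS version from the platform's own APIs):

```python
SUPPORTED = {"Windows 7", "Windows 8"}
SOFT_WARN = {"Windows Vista"}  # installable, but with a warning

def check_os_gate(detected_os, user_confirms):
    """Return True if installation may proceed on this OS."""
    if detected_os in SUPPORTED:
        return True
    if detected_os in SOFT_WARN:
        print(f"{detected_os} is not officially supported. Continue anyway?")
        return user_confirms()  # soft constraint: the user decides
    print(f"Error: this product cannot be installed on {detected_os}.")
    return False                # hard constraint: installation blocked

# Example: a Vista user who chooses to continue, then a blocked XP user
print(check_os_gate("Windows Vista", user_confirms=lambda: True))  # True
print(check_os_gate("Windows XP", user_confirms=lambda: True))     # False
```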


Saturday, February 23, 2013

How to determine the Operating System support for your product - Part 4

In the previous post of this series (Determining Operating System Support for your application - Part 3), I wrote about determining the number of people in your customer base who are using the Operating System in question. There are ways to do surveys and look at industry data, but some amount of variability is involved even when analysing the data, and some assumptions need to be made. Of course, trying to make such decisions without doing your best to get the data required for the analysis is something organizations should avoid at all costs. Such decisions can cost money that the organization can ill afford, and hence such decision making should be done with a lot of deliberation.
In this post, let us consider another factor that is of great importance in deciding when to drop support for an Operating System from your application. This is the drop in support for a particular Operating System by its makers. Consider the case of an operating system such as Windows NT or Windows 2000: support for both of these has been dropped by Microsoft, and if you were to try to get a problem on these operating systems resolved with Microsoft, they would decline to provide any support and ask you to upgrade to the newest operating system.
Now suppose you are developing an application that will run on the operating system. Any application, especially one that accesses files on the local machine or devices such as printers (and most applications provide a print interface), has a dependency on the files of the Operating System. From time to time, problems crop up where you need to work with the makers of the Operating System (typically Microsoft or Apple) and even expect them to make some fixes for you. When the makers of the Operating System withdraw support, they stop investigating such problems and no longer provide fixes for them.
So what do you do? You could still provide support for the Operating System even when its maker no longer does, but there is an inherent risk in this decision. During the development process, you could run into a problem that cripples your system and for which you have no solution, or for which a solution on your end is expensive and time consuming. In such cases, it will cause you significant problems; on the other hand, the problems that do turn up may all be minor ones. And you have to consider that the maker of the operating system would also have thought a lot about dropping support, and there would have been good reasons behind such a decision.
Apple makes it even easier. As and when Apple releases new operating systems, newly released machines are packaged with these new systems, and Apple even stops supporting older operating systems on those machines. Deciding to drop older versions of the Mac OS is easier than it is for Windows, also because the customer base using the Mac Operating System is smaller than that of Windows.

Read the next post in this series (Operating System support for your application - Part 5)


Friday, February 22, 2013

How to determine the Operating System support for your product - Part 3

In the previous post in this series on how to determine the Operating System support for your application (Operating System Support - Part 2), I talked about trying to determine which Operating Systems your current users are on. If your current users are using an Operating System in fairly significant numbers, it would be hard for you to drop support for that Operating System; you would risk turning off these users, preventing them from buying the newer version of your software, and dropping a significant potential customer base. But this decision remains a complicated one.
In the previous post, I talked about getting this data from current users. But it is not only current users from whom data needs to be taken. To ensure that you are not dropping potential users from your customer base, you need to do surveys to determine whether your potential users are on an Operating System that you are planning to drop. Consider the case of Windows XP at this point of time. It is an old Operating System, and you might think that most users have moved on from Windows XP. But suppose you decided to drop support for this Operating System in your newer version and then suddenly saw a drop in sales; it would seem obvious that you did not do the required survey of your potential users before making such a significant decision.
So what needs to be done? Well, this is a survey of your potential customers, and it is not as easy as just getting data from existing users. At the same time, there are many ways to get data related to Operating System usage. If you do a search for Operating System usage, you will find a number of articles where surveys have been done to determine the prevalence of Operating Systems among people worldwide, in specific age groups, across geographies, and so on. So, if you have a product whose potential customer base is primarily people above the age of 40 in the United States, there will be some industry data related to such a customer base. Of course, the data will not be in exactly the form that you need, so you need to account for the assumptions, look at the variables, and then do an analysis to derive the figures appropriate for you and use those as an input for your decision making.
However, it is not just about looking at industry data. Another input for decision making is to commission a survey that looks at a sample of your potential customer base and provides that data to you along with a margin of error that captures the survey's sampling assumptions. The advantage of this method is that you can determine the exact assumptions to be made in the survey and the questions to ask, and then get the results. To get accurate results and avoid inaccuracies, the survey needs to be thought through properly, and this can be expensive. But the science of surveys is fairly standard, and you can be reasonably confident about the results.
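As a rough illustration of what that margin of error looks like, here is a minimal sketch in Python using the standard formula for the margin of error of a survey proportion at 95% confidence. The 18% share and the sample size of 600 are made-up numbers for the example.

    import math

    def margin_of_error(p, n, z=1.96):
        # 95% margin of error for a survey proportion p from n respondents
        return z * math.sqrt(p * (1 - p) / n)

    # Example: 18% of 600 surveyed potential customers report running Windows XP
    p, n = 0.18, 600
    moe = margin_of_error(p, n)
    print("XP share: {:.0%} +/- {:.1%}".format(p, moe))   # about 18% +/- 3.1%

The useful takeaway is that with a few hundred respondents the uncertainty is already down to a few percentage points, which is usually good enough for a support decision of this kind.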
Now you have the data required for your decision making, and you need to add variables regarding whether people who have older operating systems will actually buy your software, since people who are comfortable with their current software and operating systems are less likely to buy newer software, even if it is very useful. For example, if 18% of your potential customers are on the older Operating System but you estimate that only a third of them would ever upgrade to your newer version anyway, the share of sales you actually risk losing is closer to 6%.

More information in the next post in this series (Deciding Operating System Support for your software - Part 4)


Wednesday, February 20, 2013

How to determine the Operating System support for your product - Part 2

In part 1 of this series (Determine the Operating System Support for an application - Part 1), I started with a discussion of the various Operating Systems that product teams could support, and some of the complications that come up when deciding on Operating System support. In this post, I will talk about some of the other issues that help decide what the operating system support should be.
One simple factor that determines which older versions of Operating Systems you should support is the customer impact if you drop a version. In today's day and age, you would be hard-pressed to find a user who has Windows 95 or Windows NT on their machine, or an equivalent older version of the Mac OS. Even if you did find somebody like that, the number of people actually on such Operating Systems would be very small, and you should be able to afford to lose such users.

The tricky part comes with Operating Systems that are closer to the current version, such as Windows XP. Three newer versions of Windows have been released since Windows XP, namely Windows Vista, Windows 7, and Windows 8. But a number of people (especially older people) do not make their software upgrade decisions based on whether a newer version is available. If a newer version is available, but their existing version provides the functionality that they are comfortable with, you will find that a significant percentage of people will not upgrade their Operating Systems or their machines. I personally know numerous people who heard bad things about Windows Vista, decided that their Windows XP installation was fine, and refused to upgrade. If your expected customer base includes people like this, then it would be foolhardy to drop support unless you are sure about your figures. And this is where things get tricky: you have to do data reporting and data analysis to get enough information to take a decision.
Even inferring from the data analysis may not be straightforward. How do you decide on the data collection technique? You would want data from 2 sources - one would be the current users of the application, and the other would be potential users. Neither of these is easy.
Suppose you want to get this data from current users. The ideal way would be to have a mechanism within the application that connects on a regular basis to your servers and provides some information about the user's computer, including the Operating System and Service Pack version. If you were getting this information from all your users, it would be very easy to consolidate it and quickly determine how many of your users are on the Operating System whose continued support is in question. Suppose you have set a benchmark of 10% as the limit above which you will continue to support an Operating System; then this data will easily help you make a decision, as sketched below.

At the same time, keep in mind that building such a system into the application is not easy. In this day and age of security and privacy considerations, you would need to ensure that such a mechanism passes legal and privacy guidelines. Further, keep in mind that some of your users may not be connected to the internet on a regular basis, and hence the data you get is only from those people who are connected. As always, when you get data, you need to make sure that it is as accurate as you can verify. The method used for data collection and the logic used for data interpretation both need to be accurate and verified; otherwise you will commit the grave mistake of taking a decision based on either wrong data or a wrong analysis of data, both of which can cause serious problems.
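Here is a minimal sketch in Python of that consolidation step, assuming hypothetical telemetry records that each carry an "os" field. The record format, the sample data, and the 10% threshold wiring are illustrative, not a real telemetry pipeline.

    from collections import Counter

    SUPPORT_THRESHOLD = 0.10   # the 10% benchmark discussed above

    def os_share(telemetry_records):
        # telemetry_records: iterable of dicts like {"os": "Windows XP", ...}
        counts = Counter(rec["os"] for rec in telemetry_records)
        total = sum(counts.values())
        return {name: n / total for name, n in counts.items()}

    # Hypothetical sample of phoned-home records
    records = [
        {"os": "Windows 7"}, {"os": "Windows 7"}, {"os": "Windows 8"},
        {"os": "Windows XP"}, {"os": "Windows 7"}, {"os": "Windows Vista"},
    ]
    for name, share in sorted(os_share(records).items()):
        verdict = "keep supporting" if share >= SUPPORT_THRESHOLD else "candidate to drop"
        print("{}: {:.0%} -> {}".format(name, share, verdict))

The aggregation itself is trivial; the hard parts, as noted above, are collecting the records legally and remembering that the sample only covers users who are online.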

That is it for this post; the series will continue in the next one (Determine Operating System Support - Part 3).


Friday, March 2, 2012

What are the different data interpretation defects?

A software system or application can perform an assigned task only when it is capable of interpreting or analyzing data. The proper analysis or interpretation of data is necessary for the proper execution of the task.
If the data interpretation itself is wrong, then you cannot expect accurate results.

STEPS INVOLVED IN INTERPRETATION OF DATA
The interpretation of data involves the following steps:
- Inspection of data
- Cleaning of data
- Transformation of data
- Modeling of data

These steps are responsible for producing meaningful data, along with conclusions and support for decision making; a sketch of the four steps follows below. There are many approaches to and facets of data interpretation.
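As a rough illustration of the four steps, here is a minimal sketch in Python using pandas. The input file name, the column names, and the simple group-mean "model" are all hypothetical.

    import math
    import pandas as pd

    # Inspection: load the data and look at its descriptive statistics
    df = pd.read_csv("measurements.csv")   # hypothetical input file
    print(df.describe())

    # Cleaning: remove rows with missing or impossible values
    df = df.dropna(subset=["value"])
    df = df[df["value"] >= 0]

    # Transformation: derive a new column on a more convenient scale
    df["log_value"] = (df["value"] + 1).apply(math.log)

    # Modeling: the simplest possible model, a per-category mean
    print(df.groupby("category")["log_value"].mean())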

DATA INTERPRETING TECHNIQUES
- Data analysis employs various data interpretation techniques in different domains.
- The following are some data interpretation techniques:
1. Data mining
These techniques focus on the modeling of data as well as on descriptive purposes.

2. Business intelligence
- This interpretation technique is suitable for heavy databases where a lot of aggregation work is required.
- It is basically used in the business domain.

3. Statistical Analysis
This comprises techniques such as exploratory data analysis (EDA, which discovers new features in the data), descriptive analysis, and confirmatory data analysis (CDA, which tests existing hypotheses). A small contrast between EDA and CDA is sketched after this list.

4. Predictive Analytics
It is employed for forecasting future outcomes.

5. Text Analytics
It is used for the extraction and classification of data from various textual sources.
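As a toy contrast between the exploratory and confirmatory styles mentioned in the Statistical Analysis item above, here is a minimal sketch in Python. The two samples are made-up numbers, and scipy's ttest_ind stands in for one common confirmatory test.

    from scipy import stats

    group_a = [12.1, 11.8, 12.5, 12.0, 11.9]
    group_b = [12.9, 13.1, 12.7, 13.3, 12.8]

    # EDA: look for features in the data without a prior hypothesis
    print("mean A:", sum(group_a) / len(group_a))
    print("mean B:", sum(group_b) / len(group_b))

    # CDA: test a stated hypothesis ("the two groups have the same mean")
    t_stat, p_value = stats.ttest_ind(group_a, group_b)
    print("t = {:.2f}, p = {:.4f}".format(t_stat, p_value))  # a small p argues against equal means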

CATEGORIES OF DATA TYPES
Different data types call for different interpretation techniques. Data is classified into the following categories:

1. Qualitative Data
Data denotes the presence or absence of a particular characteristic (pass/fail).
2. Quantitative Data
Data is numerical: either a continuous value within a specified range or a whole counting number.
3. Categorical Data
Data that falls into one of several distinct or similar categories.

DATA INTERPRETATION/ANALYSIS PROCESS
The interpretation or analysis of data is not a simple task; it involves complex processes, and complex processes are very prone to defects and errors.
- In a data interpretation process, defects can exist in every phase.
- Let us start from the first step of the process and discuss the defects as we move down in the process.
- Data cleaning involves the removal of erroneous data.
- If the program performing the data cleaning is itself defective, it can let in erroneous data, which in turn can cause many defects in the whole process.
- The changes made to the data should be documented and reversible.
- It is recommended that the data to be analyzed be quality checked as early as possible, since defective data is the cause of many defects in the interpretation process.
- There are several ways of checking the quality, such as:
# Descriptive statistics
# Normality
# Associations
# Frequency counts

- In some cases data values might be missing.
- This can cause the whole interpretation process to falter or come to a halt.
- In such cases the missing data can be imputed, as sketched below.
- Defects can also occur if the data is not uniformly distributed.
- To detect this, the randomization procedure should be checked for its success.
- If you have not included a randomization procedure, you can use a non-sampling randomization procedure.
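Here is a minimal sketch in Python using pandas of the quality checks and the imputation step described above. The column names, the toy data, and the choice of mean imputation are assumptions for illustration; in practice the imputation strategy should match the data.

    import pandas as pd

    df = pd.DataFrame({
        "score": [4.0, 5.0, None, 3.5, 4.5, None],   # two missing values
        "group": ["A", "A", "B", "B", "A", "B"],
    })

    # Quality checks: descriptive statistics, frequency counts, missing-value count
    print(df["score"].describe())
    print(df["group"].value_counts())
    print("missing scores:", df["score"].isna().sum())

    # Imputation: fill missing values with the column mean (one simple strategy)
    df["score"] = df["score"].fillna(df["score"].mean())
    print(df)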

SOME DATA DISTORTIONS
There are some possible data distortions that also give rise to data interpretation defects:
1. Item Non-Response
The data should be analyzed for this factor in the initial stage of the data analysis itself. The presence of randomization does not matter here.
2. Drop-Out
Like item non-response, the data should be analyzed for this at the beginning as well.
3. Quality Treatment
Poor-quality data should be treated with various manipulation checks.

