

Thursday, May 23, 2013

Analytics - Measuring data relating to user information - Part 4

This is a series of posts that focus on analysis and its importance in today's world. In the previous post (Measuring data relating to user information - Part 3), I talked about using the data from previous versions of the software to determine trends, and to use those trends to make major business decisions, and also use this data to complement trends seen from other sources of information.
In today's world, analytics can play a major role. I still remember an article (read the article) which described the power of analysis, and how it can even surprise people outside the industry. The same article also highlighted the privacy problems that analytics poses. If you read this article, you would be struck by what data analysis can reveal, and perhaps shocked to some degree, wondering whether you are comfortable with this kind of information about you being deduced.
And this is the main content of this post. If you are starting to collect data about your customers from within your software, you need to be sure that you are not overdoing it. Once you start designing what data you are going to collect, you need to ensure that there are hard lines that set the boundaries for the data you are collecting. If you are using a component, you need to ensure that the component respects the same data privacy constraints that you are using.
This might seem excessive, but keep in mind that privacy is a big deal. Most software development teams are not equipped to determine what is proper and what is not. This was made painfully clear to me when we let the development team design the analytics capturing process, including all the information that was supposed to be captured; when we then met the legal team, they junked more than 25% of the data that we wanted to capture. We did not like it, but certain boundaries are required. The alternative is far worse. You could go beyond the privacy guidelines, implement the data capture in your product and release it, and then somebody could detect that you are capturing information that is deemed personal, and you are stuck. So, once you get into the area of capturing information for analytics, always have the plan confirmed by somebody who is a privacy expert, which could be a legal person, or could be somebody else.
There are even more worries, especially when your product sells across geographies. A country or region may have different privacy standards when it comes to collecting information from the user's machine (for example, the European Union has far tighter guidelines when it comes to privacy) and you need to ensure that you are not falling foul of one region by following the standards of another. For example, a privacy guideline of a region could insist that users know about the information that is to be collected, and have given their permission for it.
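As a rough illustration of such hard boundaries, the sketch below drops every field that is not on an approved list, and collects nothing at all unless the user has given permission. The field names and the `ALLOWED_FIELDS` list are hypothetical; in practice the list would come out of the review with your privacy expert.

```python
# Hypothetical allowlist; a real one would be agreed with a privacy expert.
ALLOWED_FIELDS = frozenset({"os_version", "launch_count", "cpu_type"})


def filter_record(record, user_consented):
    """Return only the approved fields, or nothing without user consent."""
    if not user_consented:
        return {}
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}
```

Anything not explicitly approved, such as a folder path or a serial number, never leaves the machine in this scheme.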
But this is not to say that you should be scared of capturing user information. If you have a verified system of guidelines and are following it, you should not be worried. Make sure that you are doing your best to capture the relevant information, and you can learn far more about your users' preferences than you expected, which will help you make the best decisions about your product.


Wednesday, May 22, 2013

What are Address Binding, Dynamic Loading and Dynamic Linking?

In this article we discuss three interrelated concepts: address binding, dynamic loading and dynamic linking.

1. Address Binding: 
- There are two types of addresses for the computer memory. 
- These are called the physical address and the logical address. 
- The address binding process allocates a physical memory location to a logical pointer.
- In other words, it associates the logical address with the physical address.
- The logical address is sometimes also referred to as the virtual address.
- This concept is an important part of memory management.
- The operating system is responsible for carrying out address binding on behalf of the applications and programs that need access to memory.
- A program cannot be executed without being brought into main memory.
- The instructions of the program have to be bound to the right addresses in physical memory.
- Address binding is simply a scheme for performing this job. 
- It can be thought of as something similar to address mapping. 
- Address binding can be carried out at any of the following times:
  - Compile time
  - Loading time
  - Execution time

- In execution-time binding, every memory access made by the program goes through a register called the relocation register, which is similar to a base register.
- The offset (the logical address) is added to the value in this register to obtain the physical address.
- In load-time binding the same mapping is done, but the register does not need to be evaluated on every access.
- Instead, the addresses are mapped once, when the program is loaded into memory.
- If the base address changes, the whole program has to be reloaded.
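The difference between execution-time and load-time binding can be sketched in a few lines of Python. This is only a toy model: the class and function names are made up for illustration, and real hardware does this translation in the memory management unit rather than in software.

```python
class RelocationRegister:
    """Toy model of execution-time binding: physical = base + logical offset."""

    def __init__(self, base):
        self.base = base

    def translate(self, logical_address):
        # Evaluated on every memory access, so the base can change at any time.
        return self.base + logical_address


def bind_at_load_time(logical_addresses, base):
    """Toy model of load-time binding: rewrite every address once, at load."""
    return [base + address for address in logical_addresses]
```

With the register, moving the program only means changing `base`. With load-time binding, the rewritten addresses are fixed, which is why a change of base address forces a reload.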

2. Dynamic Loading: 
- This mechanism is very useful for a program as it helps it do the following things:
  - Loading a library into the main memory.
  - Retrieving the addresses of the variables and routines that are contained in the library.
  - Accessing those variables and executing those routines.
  - Unloading the library.
- Dynamic loading is quite different from load-time linking and static linking.
- Dynamic loading allows a program to start up even if the libraries are absent.
- It also lets the program discover which libraries are available and gain additional functionality from them.
- The operating system's loader does the actual work, but the program requests the loading explicitly, for example through dlopen on POSIX systems or LoadLibrary on Windows.
- The main advantages are, firstly, that a patched library takes effect for every program that loads it without those programs being re-linked, and secondly, that libraries can be protected against unauthorized modification.
- Dynamic loading finds its major use in the implementation of software plugins.
- It is also used in programs where the required functionality is supplied by different libraries and the user is free to select which libraries to provide.
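The four steps listed above (load, look up, use, unload) can be sketched with Python's own module loader. This is an analogy at the level of Python modules rather than operating system shared libraries, and the helper names `load_plugin`, `call_routine` and `unload_plugin` are made up for this example.

```python
import importlib
import sys


def load_plugin(module_name):
    """Load a module at run time; return None if it is absent."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None  # the program can carry on without this plugin


def call_routine(module, routine_name, *args):
    """Look up a routine by name in a loaded module and execute it."""
    routine = getattr(module, routine_name)
    return routine(*args)


def unload_plugin(module_name):
    """Remove the module from the import cache.

    Objects that still hold a reference to the module keep it alive.
    """
    sys.modules.pop(module_name, None)
```

A program could call `load_plugin` for each optional library at startup and simply skip the ones that come back as `None`, which is the "start up even if the libraries are absent" behaviour described above.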

3. Dynamic Linking: 
- This is an important part of the binding process.
- The purpose of dynamic linking is to resolve the references (symbols) that a program makes to library modules.
- This process is carried out by a linker program, called the dynamic linker.
- This program searches a set of library modules in a given sequence.
- Unlike static linking, which takes place when the executable file is created, dynamic linking takes place when the program is loaded or while it is running.
- The resolved references may be the addresses of jump targets and routines.
- These may be in different modules or in the main program.
- Dynamic linking resolves them into relocatable or fixed addresses by allocating memory to each memory segment of the referenced module.
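To make the idea of searching library modules in a given sequence concrete, here is a minimal sketch. The module names, symbol names and addresses are invented, and a real dynamic linker deals with far more (relocation types, versioning, lazy binding).

```python
def resolve_symbols(references, modules):
    """Resolve each referenced symbol against modules searched in order.

    `modules` is a list of (module_name, {symbol: address}) pairs; the first
    module that exports a symbol wins.
    """
    resolved = {}
    for symbol in references:
        for module_name, exports in modules:
            if symbol in exports:
                resolved[symbol] = (module_name, exports[symbol])
                break
        else:
            # No module in the search sequence exports this symbol.
            raise LookupError(f"unresolved symbol: {symbol}")
    return resolved
```

Because the search stops at the first match, the order of the modules decides which definition is used when two libraries export the same name.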


Analytics - Measuring data relating to user information - Part 3

This is a series dealing with analytics, and the advantages that it brings to the product team. However, any movement into the area of analytics requires a lot of careful thought, and needs time to be spent on the design of a strategy around Analytics. You cannot just say that you want analytics, and put in place some strategy. But when you do get your Analytics strategy right, there are a lot of benefits that are possible. In the previous post (Analytics - measuring data related to user information - Part 2), I talked about a scenario where a team wants to find out the video formats being used by its users in the application, and the benefits of making decisions based on this knowledge rather than making guesses (which may be right, or could be wrong). In this post, I will write more about the usage of analytics while making decisions.
Consider the previous post, which talked about which video formats are popular. However, consider that you need to make decisions about the future, which means that you need to do much more analysis about the data you are getting. So, if you have been in the game for many versions now, don't just look at data for the previous release. Instead, if you have been gathering such data for the past many releases, you need to make an effort to review this data for the past many releases in order to figure out the best possible method ahead. So, even though in the last post, we only reviewed the proportion of video formats that were in use, a better analysis would have looked at the proportion of video formats that the users have been using in the past few versions.
Over a period of time, such analysis would, in most cases, reveal trends that would be useful for the designers of the product, as well as the product managers, to know. Until you have done such an analysis, the only way to learn about such trends is to look at industry data and at research such as surveys and other user feedback, but all of this is indirect data. Analysis of analytics data lets you confirm such information, and helps you make decisions that are backed by hard data.
A possible example that shows the value of such analysis is the usage of mobile devices. The product manager would have seen a worldwide trend towards using mobile devices for capturing videos, and would then look at the analytics data from the past many versions on the source of the videos used by consumers of the application. Consider a case where the data shows a movement towards mobile devices being the source, but the trend is slow, only going up from around 12% to 16% in the past 2 years. The question in front of the product manager is whether to divert resources to producing a mobile version of the application, which would require a large amount of code change as well as architectural and design work, and hence would have an impact on the current release. The other option would be to plan for a mobile version only in the next release, which would have a lower impact on the current release. Based on this data, the product manager might decide that although a mobile version is attractive, the data does not suggest an urgent need to create one in the current release, especially given the costs of doing so. Instead, one can wait for a later release.
Taking such a decision is critical for the application. Taking it without data on consumer usage would make it more like a guesstimate: some information suggests which way to go, but the amount of data you have is not adequate to give you a high level of comfort in the decision.
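The arithmetic behind the mobile example is simple enough to write down. The sketch below assumes the trend is linear, which is a simplification of my own and not something the data guarantees; the function names are made up.

```python
def points_per_year(start_share, end_share, years):
    """Average change in share, in percentage points per year."""
    return (end_share - start_share) / years


def projected_share(current_share, rate, years_ahead):
    """Straight-line projection of the share; assumes the rate stays constant."""
    return current_share + rate * years_ahead
```

Going from 12% to 16% over 2 years is 2 percentage points per year, so a straight-line projection puts mobile at about 18% a year from now, which is the kind of number that supports waiting for the next release.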

We will continue this series on the usage of Analytics in the next post on this series.


Tuesday, May 21, 2013

Define the Virtual Memory technique?


Modern operating systems come with multitasking kernels. These kernels often run into problems related to memory management. Physical memory does not suffice for the tasks assigned to them, partly because it becomes fragmented. So they have to borrow additional space from secondary storage, but they cannot use that storage directly. Virtual memory offers a solution to this problem.

What is Virtual Memory technique?

- This technique makes the fragmented main memory appear to processes as contiguous main memory.
- Since this contiguous memory only appears to exist, it is called virtual memory, and the technique is called the virtual memory technique.
- Since it helps in managing memory, it is essentially a memory management technique.
- The main storage becomes fragmented as programs are loaded and removed over time.
- The virtual memory technique virtualizes the main memory available to processes and tasks, so that it appears to each process as a contiguous block of memory.
- This memory is a global address space. 
- Virtual address spaces such as these are managed by the operating system. 
- The real memory is assigned to the virtual memory by the operating system itself. 
- The virtual addresses of the allocated virtual address spaces are translated into physical addresses automatically by the CPU.
- It achieves this with the help of memory management hardware specially designed for this purpose.
- A process continues to execute uninterrupted as long as this hardware can translate its virtual addresses into real memory addresses.
- If the translation fails at any point, typically because the requested page is not in main memory (a page fault), execution is suspended and control is transferred to the operating system.
- The duty of the operating system is then to move the requested memory page from the backing store to the main memory.
- Once done with this, it returns control to the process that was interrupted.
- This greatly simplifies the whole execution process.
- Even if the application requires more data or code than would fit in real memory, the application itself does not have to move it to and fro between the backing store and the real memory.
- Furthermore, because processes are given distinct address spaces, this technique also protects them by isolating the memory allocated to each from other tasks.
- Application programming has been made a lot easier with the help of the virtual memory technique since it hides the fragmentation defects of the real memory. 
- The burden of memory hierarchy management is delegated to the kernel which eliminates the need for the explicit handling of the overlays by the program. 
- Thus each process can execute in an address space that is dedicated to it. 
- The need to relocate program code, or to access memory with relative addressing, is obviated.
- The concept of virtual memory was generalized and eventually named as memory virtualization. 
- Gradually, the virtual memory has become an inseparable part of the architecture of the modern computers. 
- For implementing it, dedicated hardware support is absolutely necessary. 
- This hardware is built into the CPU in the form of a memory management unit (MMU).
- To boost the performance of virtual memory, some virtual machines and emulators may employ additional hardware support.
- The older mainframe computers did not have any support for the virtual memory concept.
- With the virtual memory technique, each program accesses memory only through virtual addresses.
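The translation and page-fault steps described above can be modelled in a few lines. This is a deliberately simplified sketch: the page size is an assumed value, the class and method names are invented, and there is no eviction, so it pretends that real memory never fills up.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class VirtualMemory:
    """Toy model of address translation with demand paging (no eviction)."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # page number -> page contents
        self.page_table = {}                # page number -> frame number
        self.frames = []                    # contents of real memory frames
        self.page_faults = 0

    def _handle_page_fault(self, page):
        # The operating system's job: bring the page in from the backing store.
        self.page_faults += 1
        self.frames.append(self.backing_store[page])
        self.page_table[page] = len(self.frames) - 1

    def translate(self, virtual_address):
        """Return the physical address for a virtual address."""
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in self.page_table:
            self._handle_page_fault(page)
        return self.page_table[page] * PAGE_SIZE + offset
```

The process only ever calls `translate`; whether the page was already resident or had to be fetched is invisible to it, apart from the extra time a page fault takes.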


Monday, May 20, 2013

Analytics - Measuring data relating to user information - Part 2

In the first part of this series (measuring data related to users - Part 1), I outlined what analytics is, what kind of information can be captured from users, what kind of information should not be captured under privacy guidelines, and what you can do with this kind of information. In this post, I will continue along this line and provide some more examples (the purpose of this series of posts is to describe what can be done with analytics through real-life examples).
Let us take the example that we use a lot: a greeting card application that allows users to use their own photos or images in addition to standard greeting card background photos, allows them to use their own audio and videos or record the same from the camera and microphone on their computers, and also allows them to add their own greeting text. The finished greeting can be sent via email, or through social networks.
Now, the application designers are trying to figure out tweaks related to the videos that users upload from their own machines. Users may have videos in numerous formats, since there are many different capture devices. You could have shot the small video clip using a mobile phone (which can itself use a different video format depending on the manufacturer of the phone), a tablet, a still camera, or a video camera.
The size of the video depends on the shooting device and on whether the user has reduced the resolution in order to reduce the size; in some cases, the user may have decided to use the same video that was uploaded to YouTube (which means that the video would have been converted to the FLV video format). Now, most of these formats cannot be used just like that. Coders / decoders (codecs) need to be used for this purpose, and even though there are some open source solutions, there is also commercial software that could be used.
The decision about whether to use open source or commercial software could depend on the number of users who would be using it. So, as part of the data gathering in previous versions of the software, it could be determined which video formats users are using, and from this data the proportion of each video format can be calculated. If it turns out that the share of users using a particular format is above a certain proportion, the product team could decide that it makes sense to use the commercial video encoder/decoder rather than the open source one. The advantage that users get from a better software component may be greater than the cost advantage of using open source software. But unless you are getting such information through analytics, any decision you take would be based on a hunch rather than on information.
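Working out the proportions is the easy part once the data is there. A minimal sketch, assuming the analytics data arrives as one format name per imported video (the function names and the threshold are placeholders, not values from a real product):

```python
from collections import Counter


def format_proportions(formats_seen):
    """Return the share of each video format among all recorded imports."""
    counts = Counter(formats_seen)
    total = sum(counts.values())
    return {fmt: n / total for fmt, n in counts.items()}


def formats_above(formats_seen, threshold):
    """List the formats whose share exceeds the threshold, alphabetically."""
    shares = format_proportions(formats_seen)
    return sorted(fmt for fmt, share in shares.items() if share > threshold)
```

The formats returned by `formats_above` are the ones for which the product team would weigh the commercial encoder/decoder against the open source one.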

Read the next post in this series (Measuring data related to user information - Part 3)


Analytics - Measuring data relating to user information - Part 1

For any desktop product, capturing data relating to the computer systems of its user base is very important. For those who are not conversant with the idea of analytics or the necessity of capturing such information, it would make sense to ensure that analytics forms a part of their overall product strategy; but before that, they need to understand what analytics is and why it should play a role in the overall strategy of a product team.
As always, I will try to use layman's terms in this post rather than technical jargon. So, analytics is very simple - it means capturing information about your consumers (for example, the number of times the customer has launched the application, the processor of the customer's machine, the operating system version, whether they are using Windows or Mac, and so on). Of course, there are privacy laws in place, and you need to ensure that you are not capturing information that allows your customers to be identified (for example, if you capture the folder in which the user stores data, the folder path may contain the user name; further, most privacy advocates would be very hesitant about capturing the serial number of the user).
Capturing such information is possible through code that records events during the user's interaction with the software application (such as the number of times the user has launched a dialog window, how long the user keeps a particular feature open, the user's workflow through a process, and so on). One prime use that I have seen is capturing the number of times a dialog shut down improperly (crashed), and likewise the number of times the software application itself crashed.
Once the designers of the application decide on the information that needs to be captured, code has to be written to pass this information over the internet to a tracking mechanism on the website of the software maker. Keep in mind that this information is stored in the proper tables of a database, and that it can be very voluminous. It may be only a few KB of information for every user, but when you start dealing with thousands of users, or even millions of customers (as for large applications such as Microsoft Office and Adobe Reader), the amount of data can be huge and the database needs to be ready to handle it.
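For those who prefer to see it in code, the capturing side might look something like the sketch below. The class name, the event names and the shape of the payload are all made up for illustration, and the step of actually sending the payload to the tracking server is left out.

```python
import json
from collections import Counter


class UsageTracker:
    """Counts named events and builds a payload with no identifying fields."""

    def __init__(self, app_version, os_name):
        self.app_version = app_version
        self.os_name = os_name
        self.counts = Counter()

    def record(self, event_name):
        # Called from the application, e.g. on launch or when a dialog crashes.
        self.counts[event_name] += 1

    def payload(self):
        """Serialize the collected counts as JSON, ready to be uploaded."""
        return json.dumps(
            {
                "app_version": self.app_version,
                "os": self.os_name,
                "events": dict(self.counts),
            },
            sort_keys=True,
        )
```

Note that the payload carries counts and system details only; there is no user name, folder path or serial number in it.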
In addition to designing the database for capturing the information, the next step is processing the captured information. Even though you have captured a large amount of data, effort needs to be put into analyzing it. The ideal people for analyzing such data are typically the people working on the relevant features. So, when the data for a particular feature needs to be analyzed, the analysis should be done by the team that works on the feature, since that team has the expertise to figure out what the information means, and also what to do with it.

Read more about this in the next post - Measuring data related to user information -  Part 2


Sunday, May 19, 2013

What are different types of schedulers and their workings?


Scheduling is an important part of the working of operating systems. 
- The scheduler is the component that provides access to the resources to the processes, threads and data flows. 
- These resources may include time of the processor and the communications bandwidth. 
- Scheduling is necessary for effectively balancing the load of the system and achieving the target of QoS or quality of service. 
- Scheduling is also necessary for the systems that do multitasking and multiplexing on a single processor since they need to divide the CPU time between many processes. 
- In multiplexing, it is required for timing the simultaneous transmission of the multiple flows.

Important things about Scheduler

There are 3 things which most concern the scheduler:
  1. Throughput
  2. Latency, including the response time and the turnaround time
  3. Waiting time, or fairness
- In practice, however, these goals conflict with one another, for example latency and throughput.
- It is the scheduler that strikes a compromise between any two goals.
- Which goal gets preference is decided based on the user's requirements and objectives.
- In systems that operate in a real-time environment, such as embedded systems and robotics, the scheduler has to ensure that processes can meet their deadlines.
- This is a very critical factor in maintaining the stability of the system. 
- The administrative back end is used for managing the scheduled tasks that are then sent to the mobile devices.  

Types of Schedulers

There are 3 different types of schedulers available which we discuss below:

Long term Schedulers or Admission Schedulers: 
- The purpose of this type of scheduler is to decide about the processes and jobs to be admitted or added to the ready queue. 
- When an attempt is made to execute a program, it is the responsibility of the long-term scheduler to authorize or delay the admission of its process to the ready queue.
- Thus, this scheduler dictates which processes will be executed by the system.
- It also dictates the degree of concurrency and the mix of CPU-intensive and I/O-intensive processes.
- Modern operating systems use this to make sure that processes get enough time to finish their tasks.
- Modern GUIs would be of very little use without real-time scheduling.
- The long-term queue resides in the secondary memory.
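The admission decision can be pictured as moving jobs from a job queue into the ready queue until a limit is reached. The sketch below is only illustrative: `max_ready` stands in for the degree of concurrency the system allows, and the first-come-first-served order is an assumption of this example.

```python
from collections import deque


def admit_jobs(job_queue, ready_queue, max_ready):
    """Admit waiting jobs to the ready queue until it holds max_ready processes.

    Both queues are deques and are modified in place; the admitted jobs are
    returned in the order they were admitted.
    """
    admitted = []
    while job_queue and len(ready_queue) < max_ready:
        job = job_queue.popleft()
        ready_queue.append(job)
        admitted.append(job)
    return admitted
```

Jobs that do not fit stay in the job queue, which is how the long-term scheduler delays a request rather than rejecting it.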

Medium term Schedulers: 
- This scheduler serves the purpose of removing processes from physical memory and placing them in secondary memory, and vice versa.
- This process is called swapping out and swapping in.
- A process that has been inactive for some time might be swapped out by the scheduler.
- It may also swap out a process that page-faults frequently, has a low priority, or takes up a large amount of memory.
- This is necessary since this makes the space available for other processes.

Short term Schedulers: 
- These schedulers are more commonly known as the CPU schedulers.
- They decide which of the ready processes will be executed next after a clock interrupt, a system call, an I/O interrupt, a hardware interrupt and so on.
- Thus, short-term schedulers make decisions much more frequently than long-term and medium-term schedulers, since they have to decide after every time slice.
- There is one more component involved in CPU scheduling that is not counted among the schedulers: the dispatcher, which gives control of the CPU to the process selected by the short-term scheduler.
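To see what "deciding after every time slice" means, here is a small simulation. It uses round-robin as the policy, which is one common choice and my own pick for this example; the process names and burst times are invented.

```python
from collections import deque


def round_robin(burst_times, time_slice):
    """Simulate round-robin CPU scheduling.

    `burst_times` maps a process name to the CPU time it needs. Returns a
    dict mapping each process name to the clock time at which it finishes.
    """
    ready = deque(burst_times.items())
    clock = 0
    finish_times = {}
    while ready:
        # Scheduler picks the process at the head of the ready queue.
        name, remaining = ready.popleft()
        run = min(time_slice, remaining)
        clock += run
        if remaining > run:
            # Time slice expired: back to the end of the ready queue.
            ready.append((name, remaining - run))
        else:
            finish_times[name] = clock
    return finish_times
```

For example, `round_robin({"A": 5, "B": 2, "C": 4}, 2)` finishes B at time 4, C at time 10 and A at time 11. Every pass through the loop is one decision by the short-term scheduler.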

