Saturday, September 7, 2013
Big data – a journey 90 years back
Some weeks ago, when I was in a hotel in South East Asia, just about getting ready to check out, the TV was still on and the movie J. Edgar, directed by Clint Eastwood, was being shown. Leonardo DiCaprio stars in the movie as J. Edgar Hoover, the powerful head of the FBI for nearly 50 years. In a scene fairly early in the movie, J. Edgar Hoover is fighting for more powers for the Bureau, which until then had not been granted any teeth. All his passion goes into fighting and solving crimes. Perhaps the most important weapon, in his view, is central access to data: for instance, he fought for the nationwide centralization of fingerprint records.
Just a few scenes after the hearing, you see J. Edgar Hoover on the FBI premises as huge loads of fingerprint files arrive. This was just a first step, and shortly thereafter an identification unit was formed. All of this happened as early as 1924 – almost 90 years ago.
To me this is an early example of big data. No internet, no satellite communication, no computers, not even copy machines as we know them today existed at that time. Can you imagine the amount of data, in physical paper, that was collected with the fingerprint files in just one location? And yet this volume of data would probably fit on a single server blade in one of today’s computing centers. It is an early example of big data, but it is in no way comparable with the amounts collected nowadays – whether by corporations about their customers or by state authorities, whatever their motives. By current standards this kind of data collection would not even be considered “small data”…
The next problem after collecting the data is analytics and data mining. There was no electronic index where you could find the right record with just three keywords or two mouse clicks. A sophisticated filing system had to be in place to enable agents to work with the fingerprints. Frankly speaking, to a non-expert like me it sounds incredible that they managed to compare fingerprints from a crime scene with the file data in their archives.
Within a short period of time, this new approach to fighting crime transformed the FBI from a feared bureaucratic organization into a highly respected agency. Data collection and the development of new crime scene investigation techniques were key factors in the agency’s success. Today, the FBI employs more than 36,000 special agents and support professionals and oversees a budget of more than 8 billion US dollars. Besides its US operations, the FBI now runs field offices around the world.
Watch the
trailer of “J. Edgar” (2011) on IMDB: http://www.imdb.com/title/tt1616195/?ref_=fn_al_tt_1
Thursday, August 29, 2013
What will our future cars be like?
The present-day car is full of sensors, controllers and electronic control units. The user interface includes, for instance, the instrument panel, on-board diagnostics, user controls and settings, and in-car entertainment systems. Once you are inside the car, you are in a different world. What once felt more like a couch accidentally equipped with a steering wheel and pedals now looks more like an airplane cockpit. Yet while everything else in our world of devices and gadgets is interconnected, the car has unfortunately been left out. The only time detailed vehicle data can be accessed is when the car visits the workshop.
This presents a unique opportunity for manufacturers to start connecting cars to the internet, enabling them to capture real-time data on car usage and performance. This big data can be mined for insights into car performance and customer behavior. Manufacturers may find the costs of connecting cars prohibitive, but the benefits make a strong argument in favor. Some premium manufacturers already offer online connection packages as an extra for their cars.
Connected cars: Real-time access to vehicle data can revolutionize mobility for passengers. The experience of moving from one place to another can be improved in a connected car. The car’s performance can be customized based on the chosen route: the suspension, power, steering, transmission, tyres and so on can be optimized based on real-time data about the car’s location, the weather and the type of road. The goal of such optimization can be to improve comfort, performance or fuel economy.
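To make this a bit more concrete, here is a tiny rule-based sketch of what such context-dependent optimization could look like. The input signals, setting names and rules are purely hypothetical, chosen for illustration only.

```python
# Minimal sketch: pick car settings from real-time context (hypothetical rules).

def choose_settings(road_type: str, weather: str, goal: str = "comfort") -> dict:
    settings = {"suspension": "normal", "throttle_map": "normal", "traction": "normal"}
    if road_type == "gravel" or weather in ("rain", "snow"):
        settings["suspension"] = "soft"
        settings["traction"] = "max"       # prioritize grip in low-traction conditions
    if goal == "fuel_economy":
        settings["throttle_map"] = "eco"   # softer throttle response to save fuel
    elif goal == "performance" and weather == "dry":
        settings["suspension"] = "stiff"
        settings["throttle_map"] = "sport"
    return settings

print(choose_settings(road_type="highway", weather="rain", goal="fuel_economy"))
# -> {'suspension': 'soft', 'throttle_map': 'eco', 'traction': 'max'}
```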
The user’s driving habits can be mapped in real time and advice on improving driving can be suggested. This driving data can be analyzed to predict the remaining life of various components or the right time for a repair. It can also be used to predict failures and provide alerts about possible breakdowns.
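As a minimal illustration of the predictive-maintenance idea, the sketch below extrapolates brake-pad wear from connected-car telemetry. The field names, thresholds and wear model are my own simplifying assumptions, not any manufacturer’s actual system.

```python
# Sketch: estimate remaining brake-pad life by linear extrapolation of wear.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TelemetrySample:
    odometer_km: float       # total distance driven when the sample was taken
    pad_thickness_mm: float  # measured or estimated brake-pad thickness

def predict_service_km(history: List[TelemetrySample],
                       min_safe_thickness_mm: float = 3.0) -> Optional[float]:
    """Return the estimated km until service is due, or None if unknown."""
    if len(history) < 2:
        return None                       # not enough data to estimate a wear rate
    first, last = history[0], history[-1]
    distance = last.odometer_km - first.odometer_km
    wear = first.pad_thickness_mm - last.pad_thickness_mm
    if distance <= 0 or wear <= 0:
        return None                       # no measurable wear yet
    wear_per_km = wear / distance
    remaining_mm = max(last.pad_thickness_mm - min_safe_thickness_mm, 0.0)
    return remaining_mm / wear_per_km

samples = [TelemetrySample(10_000, 10.0), TelemetrySample(25_000, 8.5)]
km_left = predict_service_km(samples)
if km_left is None:
    print("Not enough telemetry yet to estimate wear")
elif km_left < 5_000:
    print(f"Alert: brake service recommended within {km_left:,.0f} km")
else:
    print(f"Brake pads OK, roughly {km_left:,.0f} km of service life left")
```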
Cars can also be connected to share information with each other while on the road. This car-to-car communication can cover, for example, whether the brakes are being pressed, whether a driver has fallen asleep, whether the tyres are slipping or skidding, or whether cars are too close to each other. Data from surrounding cars can be used to predict the behavior of other cars and drivers on the road.
Customized cars: Such real-time data can be analyzed to find trends and insights into driver preferences and habits. Data on the interaction between the driver and the car can be used to design and optimize various features and components: a feature that is seldom used may be removed, while a component that is used frequently may be improved for performance and reliability.
A customer’s history of driving habits and preferences could be used to develop a car customized to that person’s needs. Data on habits may also help, for instance, to optimize gasoline consumption by choosing the right engine and personalizing engine management. Such a customized and optimized car would deliver more value to the customer and also lower costs for the manufacturer by eliminating features or equipment the customer does not want.
Safety: Historical data on driving habits can be used to predict accidents, accident-prone areas, conditions that lead to accidents or errors a driver is likely to make. Real-time information on road conditions, weather and accident history can also significantly improve road safety, making driving a lot safer. Car insurance rates could be determined more accurately from a customer’s driving behavior and history. As another benefit, safer road travel could eventually make much of today’s safety equipment – anti-lock braking systems, electronic brakeforce distribution, electronic stability programs, airbags and so on – redundant, making cars a lot cheaper.
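To illustrate how driving behavior could feed into insurance pricing, here is a toy usage-based scoring sketch. The features, weights and caps are invented for illustration and do not reflect any insurer’s real model.

```python
# Toy sketch of usage-based insurance pricing: a premium adjusted by a simple
# driving-behavior score. All weights and factors are made-up illustrations.

def risk_score(hard_brakes_per_100km: float,
               night_share: float,              # fraction of km driven at night (0..1)
               speeding_share: float) -> float:  # fraction of km above the speed limit
    """Combine a few telemetry-derived features into a score around 1.0."""
    score = 1.0
    score += 0.05 * hard_brakes_per_100km   # frequent hard braking raises risk
    score += 0.30 * night_share             # night driving assumed riskier
    score += 0.50 * speeding_share          # speeding raises risk the most
    return score

def adjusted_premium(base_premium: float, score: float) -> float:
    # Cap the adjustment so the premium stays within +/-30% of the base rate.
    factor = min(max(score, 0.7), 1.3)
    return base_premium * factor

score = risk_score(hard_brakes_per_100km=2.0, night_share=0.15, speeding_share=0.05)
print(f"risk score: {score:.2f}, premium: {adjusted_premium(600.0, score):.2f} EUR")
```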
Know more: The locations where the car is parked can provide further insight into a customer’s life. This data can be a gold mine for marketers: it can be used to position various kinds of products and services, such as hotels, holiday trips, shopping malls or lifestyle products. Some may find this intrusive to their privacy, but location data can also be used to enhance the driving experience. A customer’s profile and historical preferences can be used to suggest things the customer may find interesting en route.
Manufacturers who collect, and therefore have access to, such data will be in a superior position compared to those who don’t. Intimate knowledge of their customers, gained with the help of big data, can help manufacturers build stronger relationships by delivering the value customers desire and exceeding customer expectations by accurately predicting their behavior.
Friday, August 23, 2013
Big Data = Bigger Health Inequality?
We’ve already established that big data is a game changer. This is as true for business as it is for our daily lives – including our individual and public health.
Big data is already being used in health research, preventative health care, treatment and therapy. For example, new websites are popping up that allow patients to track their symptoms and outcomes in real time. This allows others in similar medical situations to analyze the individual and aggregate outcomes to help make their own medical decisions. Some of these websites, like PatientsLikeMe, even show an expected outcome for your case for each potential intervention based on the data available on the website (watch the website creator explain it here). A January 2013 McKinsey report entitled “The ‘big-data’ revolution in health care” describes the impact of big data and how it is helping, and can continue to help, lead to right living, right care, right provider, right value and right innovation.
The May 2011 McKinsey report Big data: The next frontier for innovation, competition, and productivity estimates that “big data can enable more than $300 billion a year in value creation in US health care”. As exciting as the potential of big data for public health is, I can’t help but ask whether big data also represents the next big thing in health inequality.
It has long been established that health is strongly correlated with socioeconomic status (just check out gapminder if you aren’t convinced). Low- and middle-income countries bear a disproportionate mortality and morbidity burden. In addition, within a country, poor, vulnerable and marginalized groups have inferior health and health outcomes.
Traditionally, these are the groups who
receive the lowest level of health care coverage and fewest research dollars –
currently less than 10 percent of medical research is devoted to
diseases that account for more than 90 percent of the global burden of disease. Unfortunately, they are also the groups who generate the least
data and, consequently, where big data will have the least impact.
If big data is going to revolutionize our
world as quickly and as profoundly as predicted, and if “primary data pools are
at the heart of big-data revolution in healthcare” (McKinsey, 2013 report), it
is these already underserved and overburdened groups who will benefit the
least. This risks creating an even greater health inequality.
What’s the solution? We need to work to
generate a robust data reserve for underserved groups and start using it. Here
are some simple steps that we can take to get there:
Go digital: Many of the information systems used in global public health don’t give us big data because they don’t use digital data collection. Many large and resource-intensive sociodemographic and epidemiological surveys are still conducted on paper and are never fully entered into a digital format. New technologies and tools should make these collection methods a thing of the past. Joel Selanikio explains the problem and presents one great solution in his TED talk (here).
Ask for help… from non-experts: A research team from the Harvard School of Public Health used 1,000 non-scientist volunteers to analyze an enormous data set of tuberculosis bacterium growth videos. The team of volunteers was able to analyze the information in two days, a task that would normally have taken the research team three months. We need to start unleashing the power of crowdsourcing solutions.
Open your mind to open data: The culture of traditional research has created incentives for groups to protect their data. This has to become a thing of the past. Key players like the United Nations, including the World Health Organization, have already made their data sets public. All organizations, countries and researchers should follow suit and allow public access to non-personal health information so that evidence-based public health can be crowdsourced for all diseases across various contexts.
Paper Data Collection on Health in Rural India
Sunday, August 18, 2013
Big Data - Why you pay more than your neighbor on a plane...
Next to consumer goods and financial institutions, the travel industry sits on one of the largest data repositories in terms of customer information and number of transactions. It is also one of the most flexible and complex industries, since prices and, to some extent, capacity can be adjusted in real time to balance supply and demand. Moreover, we all know that the marginal cost of filling an empty room or airplane seat is low for the company and that an unsold spot is a forgone revenue opportunity.
The key to managing this complexity and exploiting all revenue opportunities is therefore to have up-to-date information systems that do not just present isolated information about an individual customer, but rather integrate information across all important data sources and allow superior insights and strategies to be derived. Potential data sources in the travel industry include competitors’ market share and pricing information, capacity information, as well as current and historical transaction data. By combining all this information, valuable insights can be created and concrete recommendations and trends derived automatically by the system to maximize revenue. Having such systems in place radically changes the skill profile your employees need, though: it becomes significantly more analytical and shifts towards making tactical and strategic decisions rather than executing rule-based pricing. Companies need to adjust to these changes and train their employees accordingly, as well as give them the freedom and responsibility required to act in an environment where static rules do not work. The pricing manager eventually becomes a trader who takes calculated risks on a daily basis.
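As a rough illustration of such data-driven pricing, the sketch below adjusts a base fare by the current load factor and the time to departure. The formula and coefficients are my own simplification, not an actual airline revenue-management algorithm.

```python
# Minimal sketch of demand-based seat pricing: the fare rises as the flight
# fills up and as departure approaches. Coefficients are illustrative only.

def dynamic_fare(base_fare: float,
                 seats_sold: int,
                 total_seats: int,
                 days_to_departure: int) -> float:
    load_factor = seats_sold / total_seats                   # share of capacity sold
    load_markup = 1.0 + 0.8 * load_factor                    # fuller plane -> higher fare
    urgency_markup = 1.0 + 0.5 / max(days_to_departure, 1)   # late bookings pay more
    return round(base_fare * load_markup * urgency_markup, 2)

# Two customers booking the same seat class can see very different prices:
print(dynamic_fare(100.0, seats_sold=40, total_seats=180, days_to_departure=60))
print(dynamic_fare(100.0, seats_sold=170, total_seats=180, days_to_departure=2))
```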
In addition, all this information helps companies better understand their customers and create new product offerings that differentiate them from the competition and help win market share. Examples include hourly pricing based on check-in and check-out times, as well as product bundles that make booking more convenient for the customer. By analyzing customers’ booking patterns, companies can even try to derive the willingness to pay of certain customers and apply customer-specific pricing strategies throughout the booking process, which is quite powerful in combination with loyalty programs that provide additional transparency into customers’ travel behavior.
Big Data is therefore extremely valuable for the entire travel industry, and companies should be on their toes to adopt it early and take advantage of this new field of data analytics.
Friday, August 9, 2013
The future of HR: Big Data
As we mentioned in our first post, Big Data is “everything we do”; it is in our daily lives whether we want it or not. Although some people may think it exists only at Facebook, Google, YouTube or Amazon, Big Data is much more than that and can be used in many areas of a company, such as Marketing, Sales, HR and R&D.
In this post we will focus on some of the relationships between Big Data and Human Resources. Today’s world makes recruiters compete for superior talent while the company’s top management asks for more input to make faster and better recruiting decisions. Big Data can improve the company’s selection process by helping us hire better, understand the market and filter among hundreds of CVs to select candidates who will be the right fit for our organization.
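As a toy illustration of that first-pass filtering idea, the sketch below ranks candidates by how many of a role’s required skills their CV text mentions. The skills and CV snippets are invented, and a real screening system would use far richer matching than simple keyword counts.

```python
# Toy sketch of first-pass CV screening: score resumes by how many of the
# role's required skills they mention. Skills and resumes are made-up examples.

required_skills = {"python", "sql", "machine learning", "communication"}

resumes = {
    "candidate_a": "Data analyst with Python, SQL and strong communication skills.",
    "candidate_b": "Marketing specialist, social media campaigns, communication.",
    "candidate_c": "ML engineer: Python, SQL, machine learning, cloud deployments.",
}

def skill_score(text: str) -> int:
    text = text.lower()
    return sum(skill in text for skill in required_skills)

# Rank candidates by how many required skills appear in their CV text.
ranked = sorted(resumes, key=lambda name: skill_score(resumes[name]), reverse=True)
for name in ranked:
    print(name, skill_score(resumes[name]))
```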
Moreover, if we focus on employees: companies employ hundreds of people, and over the years they have created databases with demographic and performance information – education, age, marital status, among many other factors. This information can be used to predict metrics for the organization, to make better “people” decisions in advance and, most importantly, to predict organizational performance.
Success with Big Data in HR comes from combining and analyzing metrics such as performance, previous positions and salaries, among others, to provide more accurate and smarter solutions to the business problems the organization faces.
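To make this concrete, here is a minimal sketch of how such metrics could feed a predictive model, in this case estimating attrition risk with a logistic regression. The features, data and labels are invented purely for illustration; a real HR analytics pipeline would use far richer and properly anonymized records.

```python
# Toy sketch: predicting attrition risk from a small, made-up HR dataset.

from sklearn.linear_model import LogisticRegression

# Each row: [years_at_company, performance_rating (1-5), salary_percentile (0-1)]
X = [
    [1, 3, 0.30], [2, 4, 0.55], [8, 5, 0.90], [3, 2, 0.20],
    [6, 4, 0.70], [1, 2, 0.25], [5, 3, 0.60], [7, 5, 0.85],
]
# 1 = employee left within a year, 0 = employee stayed (illustrative labels)
y = [1, 0, 0, 1, 0, 1, 0, 0]

model = LogisticRegression()
model.fit(X, y)

# Estimate the attrition risk of a current employee with hypothetical values.
candidate = [[2, 3, 0.35]]
risk = model.predict_proba(candidate)[0][1]
print(f"Estimated attrition risk: {risk:.0%}")
```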
For more
information on Human Resources and Big Data:
http://talentmgt.com/articles/view/hr-can-t-ignore-big-data/1
Monday, August 5, 2013
Guess what?! Big data “knows” who you will vote for in the next elections!
Do politicians build strong platforms, or do they just follow Big Data to win elections?
Elections: a process that involves thousands of people and takes on massive proportions every few years across all elected governments on earth. Elections are the time when huge campaigns run in every corner of a country and when consultants conduct massive polls and analyze loads of information to identify and target millions of voters.
US politicians were quick to adopt Big Data and now apply it to the attitudes and preferences of the population to “understand why people are voting for them or why they’re not, and that has the effect of hopefully being able to change policy in a more meaningful and democratic way”. The 2012 US elections showed how Big Data could turn gigantic amounts of campaign data into detailed, practical information. Data analyst Nate Silver became a celebrity when he accurately predicted the result in each of the 50 states.
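To give a flavor of how such forecasts aggregate data, here is a simplified poll-averaging sketch that weights each state poll by sample size and recency. It is only loosely inspired by the general idea behind models like Nate Silver’s and is not his actual methodology.

```python
# Simplified sketch of state-level poll aggregation: average each state's polls
# weighted by sample size and discounted by age. Weighting scheme is illustrative.

from datetime import date

def weighted_poll_average(polls, today=date(2012, 11, 5)):
    """polls: list of (candidate_share_pct, sample_size, poll_date)."""
    total_weight = 0.0
    weighted_sum = 0.0
    for share, sample_size, poll_date in polls:
        age_days = (today - poll_date).days
        # Larger samples count more; older polls count less.
        weight = sample_size / (1.0 + age_days / 7.0)
        weighted_sum += weight * share
        total_weight += weight
    return weighted_sum / total_weight

ohio_polls = [
    (50.0, 1200, date(2012, 11, 3)),
    (48.5,  800, date(2012, 10, 20)),
    (51.0, 1000, date(2012, 11, 1)),
]
print(f"Ohio weighted average: {weighted_poll_average(ohio_polls):.1f}%")
```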
A few cautionary examples of applying Big Data to elections do exist, though. In the 1948 election, the polls (the Big Data of the day) predicted a Thomas Dewey victory over Harry Truman. That election marked the first time pollsters relied on telephone surveys, which gave them access to more voters; it turned out, however, that a lot of Truman supporters didn’t have phones, and the real result went the other way. Drawing a parallel to today, when huge campaigns and platforms are built to gauge voters’ opinions on Facebook and other social platforms, we must remember that “the elderly woman in Philadelphia, who doesn’t have a photo ID, also probably doesn’t tweet much or otherwise contribute to the 15 terabytes of new information on Facebook every day”. This example shows that Big Data can be very helpful in our everyday lives and that no one can escape it, but analysts need to keep a critical mind so as not to fall blindly into the data gap.