Big Data
 


[Closed] Big Data

Posts: 91097
Free Member
Topic starter
 

Anyone working with this concept?

Just curious to know what people are up to.

NB I'm not talking about obesity statistics.


 
Posted : 20/10/2014 5:31 pm
Posts: 30656
Free Member
 

Timely...

http://www.bbc.co.uk/programmes/p028cm6q


 
Posted : 20/10/2014 5:35 pm
Posts: 0
Free Member
 

Interesting listen. The sources of most big data are not statistically representative; they are biased. Especially if you think of data from markets with much lower tech use or from consumer groups that are too exclusive. There are no shortcuts to gathering and generating statistically significant data. It is an expensive process, even with the smallest sample size that will still stand up statistically.
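
To put rough numbers on the bias point, here is a minimal sketch. The population, the 30/70 split and the spend figures are all made up for illustration; nothing here comes from the programme. It compares a large sample that only sees heavy tech users with a small random sample of everyone.

```python
import random
import statistics

# Hypothetical population: 30% heavy tech users spending 100, 70% others spending 40.
population = [("tech", 100)] * 30_000 + [("other", 40)] * 70_000
true_mean = statistics.mean(value for _, value in population)

# Large but biased sample: only the people you can observe, i.e. the tech users.
biased_sample = [value for group, value in population if group == "tech"]

# Small but random sample drawn from the whole population (fixed seed for repeatability).
rng = random.Random(0)
random_sample = [value for _, value in rng.sample(population, 1_000)]

biased_error = abs(statistics.mean(biased_sample) - true_mean)
random_error = abs(statistics.mean(random_sample) - true_mean)

print(f"true mean: {true_mean:.1f}")
print(f"biased sample (n={len(biased_sample)}): error {biased_error:.1f}")
print(f"random sample (n={len(random_sample)}): error {random_error:.1f}")
```

Under these assumptions the 30,000-row biased sample is further from the truth than the 1,000-row random one, because more rows do nothing to cancel a selection effect.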


 
Posted : 20/10/2014 6:11 pm
Posts: 0
Free Member
 

pfft just another phrase for the merry go round of IT issues that go in circles about every 10 years

if it didn't have a new name they couldn't sell new courses


 
Posted : 20/10/2014 6:11 pm
 GJP
Posts: 0
Free Member
 

I am starting to take an interest from a CRM perspective. Spent most of the weekend googling Hadoop and Cassandra stuff. As a business architect I found it tough going. Not doing anything actively yet.


 
Posted : 20/10/2014 6:15 pm
Posts: 363
Free Member
 

Yep, not a big fan of the phrase as it is just marketing crap.

Very few sets of data meet the 3 Vs required (velocity, volume and variety). The term is now just used to describe what Business Intelligence and data analytics should have been all along, although most people didn't really do this.


 
Posted : 20/10/2014 6:20 pm
Posts: 23296
Free Member
 

been looking at kaggle.com recently with a half baked view to a change in career. matlab nerd here but thinking about learning R or SciPy to give it a go.


 
Posted : 20/10/2014 7:04 pm
Posts: 0
Free Member
 

Not doing anything actively yet.

Pretty much sums up the industry at the moment. Follow @BigdataBorat for a funny angle on big data.


 
Posted : 20/10/2014 7:07 pm
Posts: 27
Free Member
 

Yep, not a big fan of the phrase as it is just marketing crap.

^^this.
it's the new "digital". it's quickly becoming an overused term; companies using it to sound forward-thinking and innovative.


 
Posted : 20/10/2014 7:08 pm
Posts: 0
Free Member
 

Big Data is like teenage sex.

Everyone thinks everyone else but them is doing it. In truth, only a very few people are doing it.

If you are a business it's only worth looking at once you're already operating at peak/close to peak efficiency and if you have deep pockets.

The cost of building the infrastructure to host it and training your people to use it, plus the time it will take for it to be useful, is very high. You may also wind up finding out you're doing pretty well, in which case you'll be confirming what you know anyway!


 
Posted : 20/10/2014 7:31 pm
Posts: 4954
Free Member
 

back2basics - Member
pfft just another phrase for the merry go round of IT issues that go in circles about every 10 years

if it didn't have a new name they couldn't sell new courses

Bang on, it's the most annoying thing about IT: the repackaging of everything into a (maybe) slightly easier to use API and claiming it's all new.

It's good that some tool kits for parallel extraction of data have been built so you don't have to roll your own, but people are talking like it is the first use of parallel computing on commodity hardware. Gets right on my tits!


 
Posted : 20/10/2014 7:49 pm
Posts: 4954
Free Member
 

Big Data is like teenage sex.

Drink and drug fueled?


 
Posted : 20/10/2014 7:50 pm
Posts: 91097
Free Member
Topic starter
 

I like that description 🙂

From my point of view, as a consultant, it reminds me of that Tom Sharpe book. Riotous Assembly or the other one, Indecent Exposure. In it, the local chief of police is paranoid about communist terrorists so he creates a group of secret agents to pose as communists and infiltrate the cell. In order to protect their identities, he doesn't tell them who else is in the group. So off they go hanging around in bars and acting like they expect communist terrorists to act and looking out for other people acting like they expect communist terrorists to act. There aren't in fact any terrorists so all that ends up happening is they form their own terrorist cell. To maintain credibility with each other they start organising actual terrorist attacks, which simply hardens the resolve of the chief of police..

I'm being asked to look into new technologies such as 'mobile' and 'big data'. I'm happy to mess with this from a technology point of view but we'll just have to see how much business it generates 🙂

I should add though that my colleague did do a prototype using Hadoop for large batch processing. Of course the same thing could have easily been done via traditional means.


 
Posted : 20/10/2014 7:59 pm
Posts: 534
Free Member
 

I work in insurance. We build models (nerd alert) based on 10s of millions of customer records in order to price risks, predict competitor's rates and predict the price elasticity of individual customers. To me it feels like Big Data, but we have been doing it since way before anyone started talking about big data. This leaves me thoroughly confused as to whether it is cutting edge or way behind the times. I have no idea what people are actually doing with Big Data in other industries.

I agree that Big Data is a fairly stupid phrase though. I have heard it compared to the International Conference on Very Large Databases, which has been running annually since 1975. Their definition of "very large" must have changed quite a lot over that time.


 
Posted : 20/10/2014 8:06 pm
Posts: 91097
Free Member
Topic starter
 

The difference between Big Data and simply lots of data is how it's stored. You can store a lot in a relational database but that's peanuts compared to what big data techniques can do. It's meant to be infinitely scalable, so you can simply keep on adding processing nodes to process more and more data.


 
Posted : 20/10/2014 8:16 pm
Posts: 4954
Free Member
 

Nothing is infinitely scalable; the bottleneck is just moved.


 
Posted : 20/10/2014 8:32 pm
Posts: 401
Free Member
 

I am but I don't call it Big Data as that is just ****.

Here is something [url= http://phased.co.uk/london-bike-theft-weather-map/ ]bike related[/url]


 
Posted : 20/10/2014 8:40 pm
Posts: 0
Free Member
 

Yep, we do a lot of analysis based around the stock markets - looking for insider dealings and so on. Just moved over to Exadata and the results aren't quite what we thought they'd be considering the costs.

Whilst very much a marketing term that's currently all the rage, many sites (Worldpay, Visa, Vodafone) have been doing something very similar for years. I think it's probably very similar to SOA in that if you have the perfect set of parameters combined with the right infrastructure and the demand from the business then it's probably the bees bendy bits. But most places wouldn't really fit 'the model'. A nice theory, but in practice it doesn't really work, certainly not for us. Hadoop books currently propping up my bike stand, about the most useful they've been...


 
Posted : 20/10/2014 8:53 pm
Posts: 91097
Free Member
Topic starter
 

How is crime data 'big'? Are there millions of crimes per second? It's worse than I thought!

flange - you're right, big data techniques are quite specific and much of what it is probably being used for could be done with something else.

I wrote a PoC creating energy forecasts with around 100k data points, and I parallelised it onto many nodes but that wasn't done with Hadoop (although it would've worked fine) and it wasn't big data.

Having said that, Hadoop is a convenient framework for chugging through data - my code did essentially the same thing but Hadoop's already written.
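
For anyone wondering what "essentially the same thing" looks like, here is a minimal single-process sketch of the map/shuffle/reduce pattern. It is not Hadoop and not my PoC code; the function names and the site/reading records are invented for illustration. In a real framework the map and reduce calls would be spread over many nodes, but the shape of the job is the same.

```python
from collections import defaultdict
from functools import reduce


def map_phase(record):
    """Emit (key, value) pairs for one input record: here, (site, reading)."""
    site, reading = record
    yield site, reading


def shuffle(pairs):
    """Group values by key, as a framework does between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single result: here, a sum."""
    return key, reduce(lambda a, b: a + b, values)


def map_reduce(records):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Made-up input: (site, energy reading) pairs.
readings = [("north", 10), ("south", 4), ("north", 5), ("south", 1)]
totals = map_reduce(readings)
print(totals)
```

Because each map call only looks at one record and each reduce call only looks at one key, the work can be split across nodes without the pieces needing to talk to each other, which is the part Hadoop has already written.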


 
Posted : 20/10/2014 9:05 pm
Posts: 0
Free Member
 

Yup, I work in it. We never call it big data; that's just a nonsense term. Is it useful to be able to analyse a single large data set instead of multiple smaller ones? Yes, no, sometimes. People are sold the tech but not the outcome and that's when it fails. If you're trying to achieve a specific outcome then it can be very powerful, for example linking customers and billions of transactions like the Tesco Clubcard database.


 
Posted : 20/10/2014 9:07 pm
Posts: 91097
Free Member
Topic starter
 

Interesting you call it a nonsense term - it's not to me. If you don't see the difference I'm wondering how big your data sets are.


 
Posted : 20/10/2014 9:08 pm
Posts: 0
Free Member
 

I've got plenty of colleagues doing it (in academia). I do some social network analysis stuff but it isn't big as n=10-15k


 
Posted : 20/10/2014 9:09 pm
Posts: 0
Free Member
 

None of our consultants ever use it and we work with some of the biggest datasets in the country. I take my steer from them. I feel it's nonsense if it has multiple definitions that no one agrees upon. Is 6 billion rows "big data" enough or are we talking at crossed purposes?


 
Posted : 20/10/2014 9:11 pm
Posts: 0
Free Member
 

The stuff I'm working on is anywhere from 1 million - 30 million data points, but it's not 'big data'. Big Data to me is data that's huge, bigger than standard relational databases can process easily. Data that numbers thousands or millions of points per second. Tweets, Facebook graph interactions, credit card transactions etc. It's a bit woolly to be honest and, like other tech buzz words, its meaning will eventually come to have a fairly standard definition, but it might take a while.


 
Posted : 20/10/2014 9:17 pm
Posts: 0
Free Member
 

Which is why I hate the term: it's a buzz word that's been knocking about for ages but isn't specific enough to mean much. It sells though, which is why Teradata, Oracle etc make billions from telling people to collect it all like a hoarder filling their house with old newspapers.
Some scientific uses like weather and climate modelling are justified in using the term as they are working on huge datasets with thousands of variables but even there an actual definition is hard to come by.

Still, you know, pays the bills.


 
Posted : 20/10/2014 9:24 pm
Posts: 91097
Free Member
Topic starter
 

6bn rows.. not that big 🙂

When I say big data I mean nosql databases, map/reduce and all that crap. Or whatnobeer said - that's the definition most people use. I wouldn't use it on a project, because it's a high level term.

Anyway, splitting hairs. I briefly considered installing a hypervisor and some VMs on my very old desktop as a hadoop cluster, but then it's single core so that would be a bit silly 🙂


 
Posted : 20/10/2014 9:26 pm
Posts: 0
Free Member
 

There's the rub: I meet people all the time who've been misinformed that 5 million customers with 100 million transactions is "big data" and they need to spend a gazillion pounds to make any sense of it. It's annoying. Then they get sold some tech, a bit of software, are given a run book and told their problems are solved.


 
Posted : 20/10/2014 9:30 pm
Posts: 91097
Free Member
Topic starter
 

Just downloaded page view stats from wikipedia. I thought it was a good fit for hbase but I am now thinking it's not, should probably just be one big hdfs file maybe.

More reading tomorrow.


 
Posted : 20/10/2014 9:41 pm
Posts: 401
Free Member
 

How is crime data 'big'?

It's not, it's all the other stuff we are aggregating along with it to analyse that is.


 
Posted : 20/10/2014 9:50 pm
Posts: 17
Free Member
 

we had a go at predicting the level of calls to a Bushfire advice line based on weather, ground conditions and heaps of other variables. Probably at the small end of big data. It was enough to put us off.

I saw a couple of decent presentations on it a few years back from IBM I think.


 
Posted : 20/10/2014 11:07 pm
Posts: 91097
Free Member
Topic starter
 

Mike that sounds more like analytics.


 
Posted : 21/10/2014 7:10 am
Posts: 17
Free Member
 

It was close to one end of it, if we had gone much further it would have headed into the big space


 
Posted : 21/10/2014 7:18 am
Posts: 0
Free Member
 

Interesting/amusing article about it on The Reg

Data Mining, noun: "Torturing data until it confesses ... and if you torture it enough, it will confess to anything."

[url= http://www.theregister.co.uk/2014/10/20/sanity_now_ending_the_madness_of_data_completism/ ]http://www.theregister.co.uk/2014/10/20/sanity_now_ending_the_madness_of_data_completism/[/url]


 
Posted : 21/10/2014 7:44 am
Posts: 12079
Full Member
 

Enjoyed that, cheers Aidan.

Not sure what Big Data is, but if there're employment opportunities it's surely worth looking into...


 
Posted : 21/10/2014 7:54 am
Posts: 0
Full Member
 

When I say big data I mean nosql databases, map/reduce and all that crap

That's how I see it. Locally I've got nothing that can't be processed perfectly well in relational databases, as long as the queries are good and the indices right. And then just chuck a ton of virtual hardware at it.

Elsewhere we have some secret squirrel stuff using it but it's certainly not breaking into our BI / normal LoB world and I don't think it will as we don't have the problems it's best placed to solve.


 
Posted : 21/10/2014 7:56 am
Posts: 0
Free Member
 

i spend a lot of time (ok, some) getting rid of data - we're drowning in the stuff.

we've got laser scanners, and GOM scanners, Alicona, and etc. which are very good at generating massive point clouds, which we then have to 'thin' - often chucking away 50% or more.

(you only need 3 points to define a radius, there's no point having 5000, especially when that curve is an agreed non-critical feature)
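
A minimal sketch of the "3 points define a radius" idea, assuming a clean, noise-free 2D scan (real scanner output is noisy, so in practice you would keep more than three points and fit). The function names and the synthetic arc are made up for illustration; the maths is the standard circle-through-three-points formula.

```python
import math


def circle_from_points(p1, p2, p3):
    """Return ((cx, cy), radius) of the circle passing through three 2D points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if d == 0:
        raise ValueError("points are collinear, no unique circle")
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return (cx, cy), math.hypot(x1 - cx, y1 - cy)


def thin(points):
    """Keep only the first, middle and last point of an ordered scan."""
    return points[0], points[len(points) // 2], points[-1]


# Synthetic "scan": 5000 noise-free points on a quarter arc, centre (1, 2), radius 5.
scan = [
    (1 + 5 * math.cos(t), 2 + 5 * math.sin(t))
    for t in (i * (math.pi / 2) / 4999 for i in range(5000))
]
centre, radius = circle_from_points(*thin(scan))
print(len(scan), "points thinned to 3; radius is about", round(radius, 6))
```

Picking well-spread points (both ends and the middle) keeps the calculation well conditioned; three neighbouring points would be far more sensitive to measurement error.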


 
Posted : 21/10/2014 8:02 am
Posts: 8839
Free Member
 

I work for a worldwide data company and Big Data is something that gets thrown around regularly but as someone has mentioned already, it seems a little vague as to what it is and what to do with it.

The way I see it, it's the stuff that's out there that isn't really easily categorised. So we deal with all sorts of data - financial, some personal etc. Big Data to me is the stuff you can't easily pigeonhole so mainly the social media stuff. Trawling social media and pulling out useful stuff from that for example. What that 'stuff' is though is anyone's guess.

As an aside, I've recently started developing with a new ETL tool that has apps to link into Hadoop etc. I'd be interested to see how that works considering it kind of grinds to a halt if you feed a 10MB Excel file into it given the limitations of the Java code behind it :S


 
Posted : 21/10/2014 8:14 am
Posts: 91097
Free Member
Topic starter
 

I just spent a week for an extremely well known and important client tweaking their Java based core system so it doesn't grind to a halt. Drop me a line, I can probably help you out 🙂


 
Posted : 21/10/2014 8:21 am
Posts: 3293
Full Member
 

Everyone wants to do it yet they don't know what it is.

Just add 'big data' and 'cloud' to a power point slide and away you go.

I have done some work with Hadoop and did some data processing waaaaaay quicker than we've ever done with SQL (and that's with some real SQL nerds). However, the type of processing lent itself well to map/reduce so it was a good fit. We certainly have data volume with as many events as you want per second. Can't really give any more details than that though.


 
Posted : 21/10/2014 8:36 am
Posts: 13817
Full Member
 

I've got 1300 odd customer records to cleanse - that's Big Data to me! 😀


 
Posted : 21/10/2014 8:45 am
Posts: 363
Free Member
 

The value of "big data" seems to come from this concept of looking at data to find an answer to a question you never knew. That is what I try to get the data analysts (or data scientists in new marketing speak) to do on a daily basis and they have been for years.
But, most of the people who talk about this do not include the statistical verification to ensure that the data they are looking at is relevant.

All the map reduce, hadoop etc are just tools that will allow you to work with data in different ways. Sometimes they will be beneficial, sometimes not, depending on each project's requirements. I don't agree with using the term big data just because you use hadoop.

It also means that more companies are put off investing in simple data analytics as the costs of implementing big data, and the fact that there is no calculable ROI with big data makes it unattractive.

For most companies, if they have basic data warehousing, collation of data and get a data analyst to provide relevant insights into their business, they will benefit massively.


 
Posted : 21/10/2014 8:59 am
Posts: 91097
Free Member
Topic starter
 

From a technology point of view (since I am a techie) the value of the tools is being able to store and process datasets that would have been way too big to fit into a traditional SQL DB, and would have been dismissed. That's basically it.


 
Posted : 21/10/2014 9:05 am
Posts: 3293
Full Member
 

Relational DBs are quite happy to store humungous amounts of data. As long as you are doing straight forward CRUD, or else at least have your schema optimised for your queries, and the way it's scaled ties in with all of this, and you can pay for whatever enterprise version of software you need, then everything is fine.

IME the tricky bit comes when the queries are not simple CRUD, are more complex, and you don't even know what they are going to be up front. Maybe they cannot make use of indexing because they need to touch a large number of rows, or maybe the query engine is not clever enough to scale your query the way you scaled your schema. That is when less structured technologies that are designed with scaling in mind _might_ be useful.

So it's not just the amount of data, it's what you want to do with it.
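
A small sketch of the index point, using SQLite from the Python standard library only because it needs no setup. The table, column and index names are invented and the data is synthetic; a proper enterprise RDBMS has a far cleverer planner, but the contrast is the same: a selective lookup can use the index, while an aggregate over every row has to read the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO events (user_id, amount) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Selective lookup: the planner can go straight to the matching rows via the index.
lookup_plan = plan("SELECT * FROM events WHERE user_id = 42")

# Aggregate over every row: no index helps, so the whole table is scanned.
aggregate_plan = plan("SELECT AVG(amount) FROM events")

print(lookup_plan)
print(aggregate_plan)
```

The exact wording of the plan text differs between SQLite versions, so treat the printed output as indicative rather than something to match on.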


 
Posted : 21/10/2014 9:41 am
Posts: 8839
Free Member
 

I just spent a week for an extremely well known and important client tweaking their Java based core system so it doesn't grind to a halt. Drop me a line, I can probably help you out

It's an out-of-the-box application. Doesn't seem to like big Excel files. Convert to tab delimited text inputs and everything is fine 🙂


 
Posted : 21/10/2014 9:43 am
Posts: 91097
Free Member
Topic starter
 

You will still be able to change the heap parameters on a 3rd party app, I'd have thought. That'll be your problem for sure. If you want, you can PM me the name of the app and I can have a look.


 
Posted : 21/10/2014 10:37 am
 dazh
Posts: 13302
Full Member
 

Ooh a 'big data' discussion. I'm hoping it's more informed than the ones we have here at a very large and well known engineering consultancy where every man and his dog thinks they know about it, and thinks they're using it. I'm still struggling to explain to some engineers that storing relational data in a number of separate excel spreadsheets isn't necessarily a good idea. I haven't even bothered trying to correct them on what they think big data is.


 
Posted : 21/10/2014 10:58 am
Posts: 8839
Free Member
 

You will still be able to change the heap parameters on a 3rd party app, I'd have thought. That'll be your problem for sure. If you want, you can PM me the name of the app and I can have a look.

Done that. Same. It's the (excuse my Java terminology!) module that the application uses to read Excel files. Apparently there's a more efficient one which reads them as XML which we're using to convert to text instead. Happy days. If it wasn't for luddites supplying in Excel files we'd be fine!


 
Posted : 21/10/2014 11:01 am
Posts: 91097
Free Member
Topic starter
 

You probably need to increase the nursery pool size on the heap 🙂


 
Posted : 21/10/2014 11:03 am
Posts: 1
Free Member
 

I think it's interesting that people not using any form of big data have no idea what it is but feel able to give a black and white opinion on it. We use big data techniques to reduce our huge daily data collection into something more manageable so 30Gb ends up nearer 6Gb (yes I mean gigabytes BTW) which we couldn't do without our largish hadoop cluster.

The investment was, and is, very large but the data we get is vital to our business so it's worth the cost etc.


 
Posted : 21/10/2014 11:09 am
Posts: 0
Full Member
 

so 30Gb ends up nearer 6Gb

I could do that with 7zip for you 😀


 
Posted : 21/10/2014 11:12 am
Posts: 363
Free Member
 

Well, by most definitions I've read, you're not doing big data, just using hadoop as an ETL tool.

Now you obviously do something with that data that could include analytics that require the 3 or 4 V's, and you could be doing all sorts of cool statistical analysis on the data, but big data is more than ETL.

Great tools if they work in your environment.


 
Posted : 21/10/2014 11:21 am
Posts: 91097
Free Member
Topic starter
 

Well Hadoop is a tool that is very useful for bigdata so it comes under that name, but yeah there is no such thing as 'doing' big data, just using bigdata tools and techniques.

To me it's a name for the tools, techniques and challenges - and what you do with them may include petabytes of data, it may not.


 
Posted : 21/10/2014 11:33 am
Posts: 0
Free Member
 

For those seeking a definition, this is from a new academic journal called [url= http://bds.sagepub.com/content/1/1/2053951714528481.full ]Big Data and Society[/url]:

Kitchin (2013) details that Big Data is:

- huge in volume, consisting of terabytes or petabytes of data;

- high in velocity, being created in or near real-time;

- diverse in variety, being structured and unstructured in nature;

- exhaustive in scope, striving to capture entire populations or systems (n = all);

- fine-grained in resolution and uniquely indexical in identification;

- relational in nature, containing common fields that enable the conjoining of different data sets;

- flexible, holding the traits of extensionality (can add new fields easily) and scaleability (can expand in size rapidly)

I take issue with the second point and argue the data doesn't need to be created in real time.


 
Posted : 21/10/2014 11:38 am
Posts: 13594
Free Member
 

Big data must be more than GBs of data, eg our radio networks generate Gbs of data every week, I happily munge it all in VBA.....

I generally reduce Gbs of data into a few Kb of KML and visualise stuff in Google Earth. More GIS than Big data though.


 
Posted : 21/10/2014 11:41 am
Posts: 91097
Free Member
Topic starter
 

Nope - afaik IBM have different product lines for processing it in real time and for dealing with it when it's been stored.


 
Posted : 21/10/2014 11:45 am
Posts: 1
Free Member
 

@Shred I can't describe exactly what we do with our data but we don't just push 30GB to 6GB daily and sit on it! We do lots of further statistical analysis that drives decision making in a business turning over £45 million plus.

@CaptJon We cover a few of those!


 
Posted : 21/10/2014 11:59 am
Posts: 3
Free Member
 

I work for a company that sells "Big Data" stuff. FWIW a few points that we tend to make, and I am not technical so don't know any of the geekery.

It's really four "V's": volume, velocity, variety and value, although that's really consultant bulls**t. There are lots of industries that generate enough data to make capturing and analyzing it a challenge such as:
Financial Services: all trading across commodities, stocks, currency etc
Utilities: data coming every 30 mins from Smart meters (or will do)
Utilities and oil companies: SCADA data being produced every 0.1 secs by tens of thousands of sensors
Security services: monitoring emails, mobiles and social media

Relational databases are very good at holding vast amounts of data but there are two challenges:
1) Cost of storage is high when you take into account the hardware, software, management, support, etc. A Hadoop environment is much cheaper for storing large amounts of rapidly changing data
2) You need to do a lot of data modeling to get stuff into a relational database and it needs to have consistent structure which is not the case in the examples above. Relational is hard to get data into, but easy to get it out, whereas Hadoop is the opposite as you can just dump anything in

So if you have huge amounts of different types of data arriving at high velocity and you want to extract value from it, you can dump it into a Big Data environment, do some analysis, throw away anything that's irrelevant and move stuff you want to keep into your higher cost relational environment.
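
I'm told by the techies that the "dump it in, keep what matters" flow looks roughly like the sketch below. Everything in it is a made-up illustration: the raw lines, the field names and the meter_readings table are assumptions, and a Python list plus an in-memory SQLite database stand in for the Hadoop store and the expensive relational environment.

```python
import json
import sqlite3

# "Dump anything in": raw lines with no agreed structure, some of them junk.
raw_lines = [
    '{"type": "meter", "meter_id": "m1", "kwh": 1.5}',
    '{"type": "tweet", "user": "someone", "text": "power cut again"}',
    '{"type": "meter", "meter_id": "m2", "kwh": 0.7, "firmware": "2.1"}',
    "not even json",
    '{"type": "meter", "meter_id": "m1", "kwh": 2.0}',
]


def parse(lines):
    """Apply structure at read time, silently skipping lines that do not parse."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict):
            yield record


# Throw away what is irrelevant; keep only complete meter readings.
meter_rows = [
    (r["meter_id"], r["kwh"])
    for r in parse(raw_lines)
    if r.get("type") == "meter" and "meter_id" in r and "kwh" in r
]

# Move what is worth keeping into a modelled, fixed-schema relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meter_readings (meter_id TEXT NOT NULL, kwh REAL NOT NULL)")
conn.executemany("INSERT INTO meter_readings VALUES (?, ?)", meter_rows)

totals = dict(
    conn.execute("SELECT meter_id, SUM(kwh) FROM meter_readings GROUP BY meter_id")
)
print(totals)
```

The trade-off is the one described above: the raw store accepts anything without modelling effort, and the cost of deciding what the data means is only paid for the slice that ends up in the relational table.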


 
Posted : 21/10/2014 12:36 pm
 Kit
Posts: 24
Free Member
 

There's an afternoon of talks about "Data Science" on 3rd November in Edinburgh, if anyone's interested. Unis deal with huge datasets on a regular basis 🙂

http://www.eventbrite.co.uk/e/launch-of-the-epsrc-centre-for-doctoral-training-in-data-science-registration-13325592205


 
Posted : 21/10/2014 2:13 pm
Posts: 363
Free Member
 

A problem I also have is finding the right people who can properly analyse and draw conclusions from the data. I'm battling to hire the right people, so I often wonder who is doing the analytics on all of these projects, and how do they know that the results they are getting are actually relevant.


 
Posted : 21/10/2014 2:24 pm
Posts: 91097
Free Member
Topic starter
 

I'll do it.. what're you paying? 🙂


 
Posted : 21/10/2014 2:57 pm
Posts: 13594
Free Member
 

I'm battling to hire the right people, so I often wonder who is doing the analytics on all of these projects, and how do they know that the results they are getting are actually relevant.

That's where the money is. You can hire a coder for peanuts from India / China, but finding people who actually understand what's going on is like finding rocking horse shit, hence pays very well.


 
Posted : 21/10/2014 3:11 pm
Posts: 0
Free Member
 

A lot of people who think they are doing business intelligence aren't. Much the same, a lot of folk who think they're doing big data aren't. Big data isn't 30GB, it's TB and PB.

There's an article in one of the London papers tonight about how big data growth is being limited by a lack of expertise, so clearly there is a market for it. I only caught the headline but if that's the case it'll either become very niche or die out altogether.

As an example, we process submissions in both XML and XBRL. XML is straightforward enough; XBRL was supposedly the new super improved way of submitting what is essentially the same data. As a banking standard everyone was told they should adopt it. So far there are about three people in the UK who understand it and can make sense of it. Great when every financial institution should be submitting in it. I give it two years before the EBA revert back to XML.


 
Posted : 21/10/2014 4:52 pm
Posts: 0
Free Member
 

Lots of XBRL in use in the Aus Superstream system, and Schematron validation XSLTs, which take an age to run.


 
Posted : 21/10/2014 5:29 pm
Posts: 363
Free Member
 

You've already said you are a techie, whereas I'm looking for analysts.

I would like someone with a stats background who has experience in analytics and data mining and understands the business. Not easy, as people are either not business types or not stats types.


 
Posted : 21/10/2014 5:39 pm
Posts: 0
Free Member
 

isn't it the same old IT problem though, companies suddenly realise they need X skill for Y project but won't hire 30 year experience tech programmer Z to train him/her into X skill, instead they pay megabucks contract wage and pull them from some other company
so that company then decides never to train any permanent staff again and does what the above company does and hires contractors, so demand fuels salary


 
Posted : 21/10/2014 5:49 pm
Posts: 0
Free Member
 

two pages in and not one of these..

BIG DATA


 
Posted : 21/10/2014 6:13 pm
Posts: 0
Free Member
 

We have some use for it but few people who can interpret it or know how to use it. Won't spend any more money until people get smarter or we get smarter people


 
Posted : 21/10/2014 7:25 pm
Posts: 91097
Free Member
Topic starter
 

You've already said you are a techie, whereas I'm looking for analysts.

I'll do it anyway. How hard can it be? 😆


 
Posted : 21/10/2014 7:45 pm
Posts: 396
Free Member
 

Big Data or just Data Mapping?

[url] http://www.technologyreview.com/news/530296/cell-phone-data-might-help-predict-ebolas-spread/ [/url]


 
Posted : 21/10/2014 9:10 pm
Posts: 401
Free Member
 

Lots of XBRL in use in the Aus Superstream system

"Makes sign of the cross and furiously eats garlic"


 
Posted : 21/10/2014 9:52 pm
Posts: 91097
Free Member
Topic starter
 

Big Data or just Data Mapping?

Rather depends on how much data there is. If you have to use big data techniques because of the size of the data set then it's big data.


 
Posted : 21/10/2014 10:05 pm
Posts: 363
Free Member
 

back2basics, it's not just a case of retraining. People that understand data are few and far between. Most of my work is fighting with the dev teams about data accuracy and problems they have introduced in the data.

Even most DBAs do not get data and another large part of my job is explaining to people why close enough is not actually good enough.

It is hugely frustrating when I really think it is very easy to put in some basic controls to ensure data accuracy and testing to ensure dev changes do not cause major problems, but again, most devs and DBAs just don't see it.


 
Posted : 21/10/2014 11:04 pm
Posts: 0
Free Member
 

+1 for Shred. I run the Data & Analytics practice of a Big Four consultancy and the mainstay of my work is to get clients to think about the outcomes they want from their data. Better insight is the important element here rather than the routes taken to reveal an outcome. MapReduce et al is just another tool in the bag. Without a purpose the tool is null and void.


 
Posted : 22/10/2014 6:44 am
Posts: 0
Free Member
 

Better insight is the important element here rather than the routes taken to reveal an outcome. MapReduce et al is just another tool in the bag. Without a purpose the tool is null and void.

but the toolset is important as otherwise you wouldn't be able to feasibly process the data in the timescales required to make the insight acquired of much use as it would be too far out of date.


 
Posted : 22/10/2014 7:46 am
Posts: 23296
Free Member
 

People that understand data are few and far between.

just out of interest, what sort of money are we talking about for people that do get 'data'...


 
Posted : 22/10/2014 8:06 am
Posts: 91097
Free Member
Topic starter
 

MapReduce et al is just another tool in the bag. Without a purpose the tool is null and void.

Quite right, and that's interesting that the same term means different things to techies and analysts. To me it's just the tools, but of course not to you 🙂

Toolset is important of course - without the tools we wouldn't be having the discussion as it would be too difficult/expensive to even attempt.


 
Posted : 22/10/2014 8:08 am
Posts: 12079
Full Member
 

just out of interest, what sort of money are we talking about for people that do get 'data'...

... and how do you get into it? I'm starting to get bored of Java EE (after 15 years of the stuff), a change would be nice...


 
Posted : 22/10/2014 10:13 am
Posts: 0
Free Member
 

I'm starting to get bored of Java EE

who wouldn't...


 
Posted : 22/10/2014 11:05 am
Posts: 0
Free Member
 

Contracting money for a decent 'known' name in big data is anywhere from £850 pd upwards. I had a pure techie in recently who was £1240 a day but he was pretty good. Useless on the Analysis side of things though.

Like anything though, just doing the courses won't 'get you into it'. I certainly wouldn't employ someone fresh off a course or on the basis that they'd built some Heath Robinson thing at home. Perhaps the best way is to be employed on a project that uses your current skillsets but also involves the technology you want to move into. Hence the issue with lack of people with skills - you can't get experience without experience.

I'm assuming that you're not heavily involved with BI stuff at the moment? I'd view a foundation in BI as the way to move into something like Big Data. If you have a Java background, some of the end user BI tools use Java and a lot of smaller firms like custom development. Certainly Cognos Report Studio is quite lucrative if you can do clever stuff that the standard tool can't do (god knows why though - it makes upgrading a nightmare). The arse has fallen out of the Cognos market though, with contract rates being pretty low (£300'ish a day for a report writer, if you can find a role). Maybe look at Tableau and QlikView as other possible toolsets that everyone currently wants. Once into reporting, move into the ETL side of things and Bob is your mother's brother..


 
Posted : 22/10/2014 12:12 pm