Viewing 40 posts - 1 through 40 (of 126 total)
  • Big Data
  • molgrips
    Free Member

    Anyone working with this concept?

    Just curious to know what people are up to.

    NB I’m not talking about obesity statistics.

    Jamie
    Free Member
    TrailriderJim
    Free Member

    Interesting listen. The sources of most big data are not statistically representative, they have bias. Especially if you think of data from markets with much lower tech use or from consumer groups that are too exclusive. There are no shortcuts to gathering and generating statistically significant data. It is an expensive process, even if using a small enough sample size for it to statistically stand up.

    back2basics
    Free Member

    pfft, just another phrase for the merry-go-round of IT issues that go in circles about every 10 years

    if it didn’t have a new name they couldn’t sell new courses

    GJP
    Free Member

    I am starting to take an interest from a CRM perspective. Spent most of the weekend googling Hadoop and Cassandra stuff. As a business architect I found it tough going. Not doing anything actively yet.

    Shred
    Free Member

    Yep, not a big fan of the phrase as it is just marketing crap.

    Very few sets of data meet the 3 Vs required (Velocity, Volume and Variety). The term is now just used to describe what Business Intelligence and data analytics should have been all along, although most people didn’t really do this.

    jam-bo
    Full Member

    been looking at kaggle.com recently with a half baked view to a change in career. matlab nerd here but thinking about learning R or SciPy to give it a go.

    Pembo
    Free Member

    Not doing anything actively yet.

    Pretty much sums up the industry at the moment. Follow @BigdataBorat for a funny angle on big data.

    brakes
    Free Member

    Yep, not a big fan of the phrase as it is just marketing crap.

    ^^this.
    it’s the new “digital”. it’s quickly becoming an overused term; companies using it to sound forward-thinking and innovative.

    curiousyellow
    Free Member

    Big Data is like teenage sex.

    Everyone thinks everyone else but them is doing it. In truth, only a very few people are doing it.

    If you are a business it’s only worth looking at once you’re already operating at peak/close to peak efficiency and if you have deep pockets.

    The cost of building the infrastructure to host it, train your people to use it and the time it will take for it to be useful is very high. You may also wind up finding out you’re doing pretty well, in which case you’ll be confirming what you know anyway!

    TheBrick
    Free Member

    back2basics – Member
    pfft, just another phrase for the merry-go-round of IT issues that go in circles about every 10 years

    if it didn’t have a new name they couldn’t sell new courses

    Bang on, it’s the most annoying thing about IT: the repackaging of everything into a (maybe) slightly easier to use API and claiming it’s all new.

    It’s good that some toolkits for parallel extraction of data have been built so you don’t have to roll your own, but people are talking like it is the first use of parallel computing on commodity hardware. Gets right on my tits!

    TheBrick
    Free Member

    Big Data is like teenage sex.

    Drink and drug fueled?

    molgrips
    Free Member

    I like that description 🙂

    From my point of view, as a consultant, it reminds me of that Tom Sharpe book. Riotous Assembly or the other one, Indecent Exposure. In it, the local chief of police is paranoid about communist terrorists so he creates a group of secret agents to pose as communists and infiltrate the cell. In order to protect their identities, he doesn’t tell them who else is in the group. So off they go hanging around in bars and acting like they expect communist terrorists to act and looking out for other people acting like they expect communist terrorists to act. There aren’t in fact any terrorists so all that ends up happening is they form their own terrorist cell. To maintain credibility with each other they start organising actual terrorist attacks, which simply hardens the resolve of the chief of police.

    I’m being asked to look into new technologies such as ‘mobile’ and ‘big data’. I’m happy to mess with this from a technology point of view but we’ll just have to see how much business it generates 🙂

    I should add though that my colleague did do a prototype using Hadoop for large batch processing. Of course the same thing could have easily been done via traditional means.

    Fueled
    Free Member

    I work in insurance. We build models (nerd alert) based on 10s of millions of customer records in order to price risks, predict competitors’ rates and predict the price elasticity of individual customers. To me it feels like Big Data, but we have been doing it since way before anyone started talking about big data. This leaves me thoroughly confused as to whether it is cutting edge or way behind the times. I have no idea what people are actually doing with Big Data in other industries.

    I agree that Big Data is a fairly stupid phrase though. I have heard it compared to the International Conference on Very Large Databases, which has been running annually since 1975. Their definition of “very large” must have changed quite a lot over that time.

    molgrips
    Free Member

    The difference between Big Data and simply lots of data is how it’s stored. You can store a lot in a relational database but that’s peanuts compared to what big data techniques can do. It’s meant to be infinitely scalable, so you can simply keep on adding processing nodes to process more and more data.

    TheBrick
    Free Member

    Nothing is infinitely scalable; the bottleneck is just moved.

    DavidB
    Free Member

    I am but I don’t call it Big Data as that is just ****.

    Here is something bike related

    flange
    Free Member

    Yep, we do a lot of analysis based around the stock markets – looking for insider dealings and so on. Just moved over to Exadata and the results aren’t quite what we thought they’d be considering the costs.

    Whilst very much a marketing term that’s currently all the rage, many sites (Worldpay, Visa, Vodafone) have been doing something very similar for years. I think it’s probably very similar to SOA in that if you have the perfect set of parameters combined with the right infrastructure and the demand from the business then it’s probably the bees bendy bits. But most places wouldn’t really fit ‘the model’. A nice theory, but in practice it doesn’t really work, certainly not for us. Hadoop books currently propping up my bike stand, about the most useful they’ve been…

    molgrips
    Free Member

    How is crime data ‘big’? Are there millions of crimes per second? It’s worse than I thought!

    flange – you’re right, big data techniques are quite specific and much of what it is probably being used for could be done with something else.

    I wrote a PoC creating energy forecasts with around 100k data points, and I parallelised it onto many nodes but that wasn’t done with Hadoop (although it would’ve worked fine) and it wasn’t big data.

    Having said that, Hadoop is a convenient framework for chugging through data – my code did essentially the same thing but Hadoop’s already written.

    joolsburger
    Free Member

    Yup, I work in it. We never call it big data, that’s just a nonsense term. Is it useful to be able to analyse a single large data set instead of multiple smaller ones? Yes, no, sometimes. People are sold the tech but not the outcome and that’s when it fails. If you’re trying to achieve a specific outcome then it can be very powerful, for example linking customers and billions of transactions like the Tesco Clubcard database.

    molgrips
    Free Member

    Interesting you call it a nonsense term – it’s not to me. If you don’t see the difference I’m wondering how big your data sets are.

    CaptJon
    Free Member

    I’ve got plenty of colleagues doing it (in academia). I do some social network analysis stuff but it isn’t big as n=10-15k

    joolsburger
    Free Member

    None of our consultants ever use it and we work with some of the biggest datasets in the country. I take my steer from them. I feel it’s nonsense if it has multiple definitions that no one agrees upon. Is 6 billion rows “big data” enough or are we talking at cross purposes?

    whatnobeer
    Free Member

    The stuff I’m working on is anywhere from 1 million – 30 million data points, but it’s not ‘big data’. Big Data to me is data that’s huge, bigger than standard relational databases can process easily. Data that numbers thousands or millions of points per second. Tweets, Facebook graph interactions, credit card transactions etc. It’s a bit woolly to be honest and like other tech buzz words its meaning will eventually come to have a fairly standard definition, but it might take a while.

    joolsburger
    Free Member

    Which is why I hate the term, it’s a buzz word that’s been knocking about for ages but isn’t specific enough to mean much. It sells though, which is why Teradata, Oracle etc. make billions from telling people to collect it all like a hoarder filling their house with old newspapers.
    Some scientific uses like weather and climate modelling are justified in using the term as they are working on huge datasets with thousands of variables but even there an actual definition is hard to come by.

    Still, you know, pays the bills.

    molgrips
    Free Member

    6bn rows.. not that big 🙂

    When I say big data I mean NoSQL databases, map/reduce and all that crap. Or what whatnobeer said – that’s the definition most people use. I wouldn’t use it on a project, because it’s a high level term.

    Anyway, splitting hairs. I briefly considered installing a hypervisor and some VMs on my very old desktop as a Hadoop cluster, but then it’s single core so that would be a bit silly 🙂

    joolsburger
    Free Member

    There’s the rub: I meet people all the time who’ve been misinformed that 5 million customers with 100 million transactions is “big data” and they need to spend a gazillion pounds to make any sense of it. It’s annoying. Then they get sold some tech, a bit of software, are given a run book and told their problems are solved.

    molgrips
    Free Member

    Just downloaded page view stats from Wikipedia. I thought it was a good fit for HBase but I am now thinking it’s not, should probably just be one big HDFS file maybe.

    More reading tomorrow.

    DavidB
    Free Member

    How is crime data ‘big’?

    It’s not, it’s all the other stuff we are aggregating along with it to analyse that is.

    mikewsmith
    Free Member

    we had a go at predicting the level of calls to a Bushfire advice line based on weather, ground conditions and heaps of other variables. Probably at the small end of big data. It was enough to put us off.

    I saw a couple of decent presentations on it a few years back from IBM I think.

    molgrips
    Free Member

    Mike that sounds more like analytics.

    mikewsmith
    Free Member

    It was close to one end of it, if we had gone much further it would have headed into the big space

    Aidan
    Free Member

    Interesting/amusing article about it on The Reg

    Data Mining, noun: “Torturing data until it confesses … and if you torture it enough, it will confess to anything.”

    http://www.theregister.co.uk/2014/10/20/sanity_now_ending_the_madness_of_data_completism/

    mogrim
    Full Member

    Enjoyed that, cheers Aidan.

    Not sure what Big Data is, but if there’re employment opportunities it’s surely worth looking into…

    brassneck
    Full Member

    When I say big data I mean nosql databases, map/reduce and all that crap

    That’s how I see it. Locally I’ve got nothing that can’t be processed perfectly well in relational databases, as long as the queries are good and the indices right. And then just chuck a ton of virtual hardware at it.

    Elsewhere we have some secret squirrel stuff using it but it’s certainly not breaking into our BI / normal LoB world and I don’t think it will as we don’t have the problems it’s best placed to solve.
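
On the point above about relational databases coping fine "as long as the queries are good and the indices right", here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns and the index name are made up for illustration; nothing here reflects anyone's real schema. It just shows the same lookup being planned as a full scan before the index exists and as an index search afterwards.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail strings from EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


# Hypothetical table: 100,000 orders spread over 1,000 customers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    ((i % 1000, float(i)) for i in range(100_000)),
)

lookup = "SELECT amount FROM orders WHERE customer_id = ?"

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup, (42,))

# With an index on the filter column, the planner can search instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, lookup, (42,))

# The answer is the same either way; only the work done to get it changes.
total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer_id = ?", (42,)
).fetchone()[0]
```

Printing `plan_before` and `plan_after` shows the difference; the exact wording of the plan text varies between SQLite versions.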

    ahwiles
    Free Member

    i spend a lot of time (ok, some) getting rid of data – we’re drowning in the stuff.

    we’ve got laser scanners, GOM scanners, Alicona, etc. which are very good at generating massive point clouds, which we then have to ‘thin’ – often chucking away 50% or more.

    (you only need 3 points to define a radius, there’s no point having 5000, especially when that curve is an agreed non-critical feature)
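
The "3 points define a radius" remark can be made concrete. Below is a rough sketch, assuming 2D points in a plane: `circle_from_three_points` uses the standard circumcircle formula, and `thin_point_cloud` is a deliberately naive every-nth-point decimation. Both function names are invented here, and real scanner software will thin far more cleverly (and in 3D), so treat this as an illustration of the idea only.

```python
import math


def circle_from_three_points(p1, p2, p3):
    """Return ((cx, cy), radius) of the circle through three 2D points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Twice the signed area term; zero means the points lie on a line.
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        raise ValueError("points are collinear, no unique circle")
    s1 = x1 * x1 + y1 * y1
    s2 = x2 * x2 + y2 * y2
    s3 = x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return (cx, cy), math.hypot(x1 - cx, y1 - cy)


def thin_point_cloud(points, keep_every=2):
    """Naive thinning: keep every nth point (keep_every=2 drops about 50%)."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    return points[::keep_every]


# 5000 noise-free points on a quarter arc of radius 10 centred at (3, 4).
arc = [
    (3 + 10 * math.cos(t), 4 + 10 * math.sin(t))
    for t in (i * (math.pi / 2) / 4999 for i in range(5000))
]
thinned = thin_point_cloud(arc, keep_every=2)
centre, radius = circle_from_three_points(thinned[0], thinned[len(thinned) // 2], thinned[-1])
```

With ideal data like this, three well-spread points recover the same centre and radius as the full 5000. Real scans carry noise, so in practice you would fit over more than three points; how many is a judgment call for the feature in question.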

    DaveyBoyWonder
    Free Member

    I work for a worldwide data company and Big Data is something that gets thrown around regularly but as someone has mentioned already, it seems a little vague as to what it is and what to do with it.

    The way I see it, it’s the stuff that’s out there that isn’t really easily categorised. So we deal with all sorts of data – financial, some personal etc. Big Data to me is the stuff you can’t easily pigeonhole so mainly the social media stuff. Trawling social media and pulling out useful stuff from that for example. What that ‘stuff’ is though is anyone’s guess.

    As an aside, I’ve recently started developing with a new ETL tool that has apps to link into Hadoop etc. I’d be interested to see how that works considering it kind of grinds to a halt if you feed a 10MB Excel file into it given the limitations of the Java code behind it :S

    molgrips
    Free Member

    I just spent a week for an extremely well known and important client tweaking their Java based core system so it doesn’t grind to a halt. Drop me a line, I can probably help you out 🙂

    llama
    Full Member

    Everyone wants to do it yet they don’t know what it is.

    Just add ‘big data’ and ‘cloud’ to a power point slide and away you go.

    I have done some work with Hadoop and did some data processing waaaaaay quicker than we’ve ever done with SQL (and that’s with some real SQL nerds). However, the type of processing lent itself well to map/reduce so it was a good fit. We certainly have data volume with as many events as you want per second. Can’t really give any more details than that though.
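
For anyone wondering what "lent itself well to map/reduce" means in practice, here is a toy, single-process sketch of the pattern in plain Python. It is not Hadoop and makes no claim about the processing described above; the `(timestamp, event_type)` record shape and all function names are assumptions for illustration. The point is that the map step looks at one record at a time and the reduce step looks at one key at a time, which is what lets a framework spread both across many nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Map: turn one raw record into zero or more (key, value) pairs."""
    _timestamp, event_type = record
    yield (event_type, 1)


def shuffle(pairs):
    """Shuffle: group all values by key (a framework does this across nodes)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all the values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical event stream: (timestamp in seconds, event type).
events = [
    (0, "login"),
    (0, "click"),
    (1, "click"),
    (1, "click"),
    (2, "logout"),
]
counts = map_reduce(events)
```

Counting events per type is about the simplest job that fits the pattern. Jobs where each record needs to see lots of other records (multi-way joins, iterative algorithms) fit it much less comfortably, which is roughly the "good fit" caveat above.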

    the-muffin-man
    Full Member

    I’ve got 1300 odd customer records to cleanse – that’s Big Data to me! 😀

