December 17, 2011 / Kirsty Pitkin

IDCC 11: What Did You Say?

  1. Opening Keynote

    Ewan McIntosh got the audience talking with his inspiring presentation about public data opportunities, in which he challenged them to learn from seven-year-olds and create wonder with stories about their data…
  2. McIntosh: What is the secret sauce of data? Why does some data have more impact than others? #IDCC11
  3. #idcc11 Ewan McIntosh – the general public are one of our main stakeholders, they are who we are creating data for, they can curate data
  4. McIntosh: Tell good stories about your data to get impact. Explain it in a compelling way #idcc11 #tedtalks
  5. McIntosh: I talk about ‘data’ but you all talk about ‘Data’. Data has a snobbery problem and a communication problem #idcc11
  6. McIntosh “create a sense of wonder”. Nice example using “Debtris” visualization for international debt #idcc11
  7. 5 lessons from seven year olds: 1. tell a story 2. create curiosity 3. create wonder 4. solve pain 5. create a reason to trade data #idcc11
  8. Now that was one of the best keynotes I have heard all year. #idcc11
  9. in short (to the #idcc11 audience), “nobody knows or cares what you do because you don’t use your data to tell stories or inspire wonder”
  10. A full summary of Ewan’s presentation can be found in this DCC blog post:
  11. Organisational Perspectives

  12. David Lynn from the Wellcome Trust provided us with the funders’ perspective on the issue of open data and outlined some of the efforts of the Wellcome Trust to encourage greater data sharing…
  13. DL: key challenges for data sharing: infrastructure (storage), cultural (incentives), technical (standards), professional & ethical #idcc11
  14. Lynn: Regulatory framework should be proportionate and risk-based, ensure does not unnecessarily restrict research, use of datasets #idcc11
  15. Jeff Haywood from the University of Edinburgh provided an institutional perspective by describing how they are looking after the university’s data…
  16. According to #idcc11 a new/old definition of curation must be: Storytelling. Haywood asks how do you preserve the story and data #IDCC11
  17. Haywood: pressure for RDM came primly from internal researcher demand #idcc11 <- wonder how typical this is?
  18. Jeff’s talk also led to the “most mixed metaphor” tweet of the event (as nominated by @andypowe11)
  19. Heywood: carrots are more effective than sticks. Herding cats is dead easy – you put fish at the other end of the room… #idcc11
  20. Andrew Charlesworth, Director of the Centre for IT and Law (CITL) at the University of Bristol, gave a legal perspective on the issue of open data, arguing for better education for researchers about the issues and for resistance to over-legalisation…
  21. #idcc11 Charlesworth Not just interested in data but also backstory (processes/methods) this is what may get challenged
  22. Charlesworth: new data curation verb coined – make metadata ‘stick’ (adhesion) #idcc11
  23. Andrew Charlesworth (Bristol) on avoiding undue “legalisation”. Should social science data be treated the same way as medical data? #idcc11
  24. Andrew Charlesworth: training crucial for researchers etc. to deal with FOI, litigation issues, confidential data, ethics #idcc11 #jiscmrd
  25. A full summary of the Organisational Perspectives strand can be found in this DCC blog post:
  26. Researcher Perspectives

  27. Mark Hahnel discussed FigShare, a tool aimed at helping researchers to gain credit for all of their research outputs by making them available in a citable manner…
  28. Hahnel #idcc11 “What I think researchers need to get for open data is a little ego boost”
  29. “if you go lower and lower, you can always find a niche where you’re number one” – hahnel on researchers’ egos at #idcc11
  30. #idcc11 Hahnel – People are putting their research videos on YouTube but it isn’t citable, no DOI, same with images on Flickr
  31. #idcc11 Figshare looks elegant and provides persistent identifier for various kinds of research object to aid citation.
  32. @FigShare looks fantastic. Love going to conferences and learning about something new and useful #idcc11
  33. Victoria Stodden from the Department of Statistics at Columbia University spoke about reproducible research and explained why reproducibility can be used to effectively frame the open data debate for scientists…
  34. Stodden: The concept of reproducability is understood by all scientists and is a strong incentive for promoting longevity of data #idcc11
  35. VS on credibility crisis. Looked at code availability in JASA articles. In 1996 0% of code available. In 2011 = 21%. #idcc11
  36. Stodden: Reproducibility allows you to talk about the code in the same discussions as the data. Without code, you don’t have data #idcc11
  37. Heather Piwowar, a DataONE postdoc with NESCent, discussed data use attribution and impact tracking, arguing for open data about data…
  38. @researchremix Sharing is hard work and you don’t want to be the one others build on, you want to be the top dog! #idcc11 #jiscmrd
  39. HP: CV’s boring and static. What if they showed “what difference did it make”? Nice segue into @totalimpactdev #idcc11
  40. Wonderful! @researchremix call to arms: ‘Get excited and make things!’ #idcc11 ‘The future is open; open data and open data about our data!’
  41. A full summary of the Researcher Perspectives strand can be found in this DCC blog post:
  42. Posters

  43. Quite a few infrastructure and workflow posters at #idcc11 this year. What kinds of *operational/financial* benchmarking exists for repos?
  44. DCC Symposium

  45. The DCC Symposium was presented by Liz Lyon from UKOLN and Adam Hedgecoe from the ESRC Centre for Economic & Social Aspects of Genomics (Cesagen) at Cardiff University. The debate focused on dealing with very personal genomic data, and linked in with UK Prime Minister David Cameron’s recent announcements about NHS patient data…
  46. Openness prized but causing probs in genomics community. E.g., you can deaggregate genetic information to match to the individual. #idcc11
  47. #idcc11 General principle of medical tests s “Don’t do a test unless the result might affect your actions.” Not true of much DTC genomics
  48. Ashley: once data is open, it’s difficult to be picky abt who will use it and why. #idcc11
  49. Summing Up

  50. Cliff Lynch from CNI provided a wide-ranging summary of the day, full of this type of insight…
  51. Lynch: Questions whether we should simply replicate existing citation pratice, with all of its flaws, in the data domain #idcc11
  52. A full summary of his remarks can be found in this DCC blog post:
  53. Open data driving scholarly communications in 2020

  54. Professor Philip E Bourne from the Department of Pharmacology at University of California San Diego and Editor-in-Chief of PLoS Computational Biology gave a remote presentation to the conference via Skype to outline what he sees as the future for data publishing…
  55. Phil Bourne: we can’t currently manage the data we’re producing, must think about cost vs benefit, what to keep, and the long tail #idcc11
  56. #idcc11 15 yr old bugged authors to get their data and wrote a paper which is accepted for review by Science!
  57. #idcc11 Bourne: Users want more of the data associated with the literature and more of the literature associated with the data. No surprise.
  58. Bourne calls for a data registry, i.e. individual repositories register metadata e.g. all data relating to a particular gene #idcc11
  59. Best Peer Reviewed Paper

    The award for the best peer-reviewed paper submitted to IDCC 11 went to William Underwood from Georgia Tech Research Institute for his paper: Grammar-based Specification and Parsing of Binary File Formats
  60. #idcc11 Underwood describing an approach to file format validation which builds on 40 yrs of experience with programming languages
  61. Parallel Sessions

  62. IDCC 11 featured six parallel sessions on Data Management & Planning, Workflows, Policy & Legal Issues, Preservation, Formats and Environmental Data. Here are just a few audience responses to those sessions…
  63. Poschen outlines some of the problems researchers have with data: makes me think it’s a miracle they ever manage to do any research! #idcc11
  64. “data is a first class research output and should therefore be published” – simple australian goodness from #ands at #idcc11
  65. #FoI misnomer: ‘if I store my data outside the uni systems e.g. on a usb, it won’t be subject to the requirements’ #idcc11 #risk
  66. #idcc11 #3b “we share because we do science, not alchemy”
  67. #idcc11 McNally durability: timing of data flow has to be synchronised w/ domain specific temporal dynamics eg. time stamp problems #jiscmrd
  68. Closing Keynote

  69. The closing keynote was presented by Natasa Milic-Frayling from Microsoft Research Cambridge (MSRC), who spoke about advancing research and technology to safeguard our digital future…
  70. Milic-Frayling: We have committed to express some of our deepest insights and knowledge in a form we don’t yet understand very well #idcc11
  71. Milic-Frayling #idcc11 when we think of the digital need to think of the evolution – digital media is a victim of its own success: rapid dev
  72. #idcc11 Milić-Frayling: with digital, need the perspective of keeping the content alive, rather than just maintaining a static “trophy”
  73. Watch IDCC 11

  74. A full gallery of videos from the event is available here:
  75. More information about the 7th International Digital Curation Conference is available from the event website.

    Images by Tim Gander.
