Doing Research

Advice on and resources for doing research

11/12/2018

Advice on/Resources for doing (economic) research

  • Collaborate via GitHub, not only for sharing code. [Link]
  • Some blogs on doing research: Research Whisperer [Link], Research Fortnight [Link]
  • A very interesting Nature special on irreproducible research, in my opinion also highly relevant for economics. [Link]
  • Short article on "How to persuade journals to accept your (political science) replication paper". [Link]

Economic research

  • Stay informed about new working papers added to RePEc via its NEP service. [Link]
  • Research-based economic policy publications by CEPR (with a focus on the UK) [Link], as well as from Germany (in German) [Link] and from Italy (in Italian). [Link]

Data sources

  • Collections of links to freely accessible datasets (with development economics in mind, but the datasets can be useful for many other economists):
    • by Markus Eberhardt from Nottingham [Link]. Note the subpages for micro and macro data.
    • by Masayuki Kudamatsu [Link]
    • Harvard Dataverse: a collection of research datasets, mainly for replication of published papers, although some published papers are themselves about creating datasets. REStat, among others, uses this platform. For the Economics of Innovation there are a number of interesting datasets available here (just search, e.g., for "Compustat"). [Link]
    • Jesse Perla has a nice collection of links to data for (macro-)innovation studies. [Link] Also offers a GitHub tutorial, some advice on Julia, and how to prepare your Maths background for Econ PhD programmes.
  • Non-patent innovation data:
    • Country level: Eurostat Science, Technology and Innovation data [Link]
    • Firm level survey data:
      • Community Innovation Survey [Link]
        • Available either on CD-ROM or via on-site access at Eurostat in Luxembourg
      • Spanish Technological Innovation Panel (in Spanish), data available on request [Link]
  • Patent data in particular:
    • NBER patent data project: the "seminal" patent data set for academic research [Link]
      • includes aggregation of patents at the assignee level and a match of assignee names to Compustat company identifiers
      • Presentation by Bronwyn Hall on how to use patent data as indicators [Link]
        • she also provides a wealth of links on patent data on her personal website [Link]
    • Ivan Png provides the above-mentioned Compustat data plus detailed data on R&D locations (1979-1998) and US trade secret law (1975-2008) [Link]
    • Harvard Patent Inventor database: patent data aimed at research on the inventors behind patents, described in Li, Lai, D'Amour, Doolin, Sun, Torvik, Yu, & Fleming (2014). Disambiguation and Co-authorship Networks of the US Patent Inventor database (1975-2010). Research Policy, 43(6), 941-955. [Link]
    • Match of patent assignee names to CRSP company identifiers, described in Kogan, Papanikolaou, Stoffman & Seru (forthcoming) Technological Innovation, Resource Allocation and Growth. Quarterly Journal of Economics. [Link]
      • covers longer time period than the NBER match, also apparently some corrections of matching patents to assignees
    • PATSTAT: SQL-queried database of raw patent data provided by the European Patent Office, but containing data from about 90 different patent offices. Subscription is fee-based, but the two-month trial account offers the full functionality and is therefore perfectly sufficient for obtaining the data for one (or two) research project(s). [Link]
    • PATSTAT-(I)CRIOS database: data taken from PATSTAT, but "elaborated by CRIOS to produce a cleaned and harmonized database". Access on request. [Link]
    • OECD patent databases: access requires an application, which should be granted relatively easily; particularly interesting are the Triadic Patent Families data [Link currently broken]
    • Collection of links to patent data by the USPTO [Link]
      • Ready-to-use research datasets compiled by the USPTO's Office of the Chief Economist, which has been very active in this area in recent years; data in particular on claims, the examination process, and changes in assignment [Link]
      • Patentsview.org, the USPTO's new patent data visualisation tool, also offers bulk data downloads of all sorts, including patent full texts. These data come in extremely raw form and require major processing efforts before being usable in empirical analysis, but at least you don't have to scrape them yourself anymore. [Link]
    • Chinese Patent Data Project [Link]
    • Singapore-Melbourne Patent Database [Link]
    • Jeffrey Kuhn and co-authors have developed a number of interesting datasets based on patent data, including similarity of citing patents and patent examiner toughness [Link]
  • Patent litigation:
    • U.S. Patent Litigation Statistics 2000-2013 [Link]
  • Concordance tables: patent data is categorised by technology, but most economic research questions deal with industries, so concordance tables can be quite important, even though for obvious reasons they can never be perfect.
    • NAICS to SIC [Link]
    • Index of Correspondence Tables provided by Eurostat [Link]
  • General IO data:
    • Linking patent data to firm data is standard practice today. US firm data is usually taken from Compustat or the Compustat-CRSP merged database [Link], both usually accessed via the WRDS data service.
    • A source of data on European firms is Bureau van Dijk's Amadeus database. Amadeus contains European firms of all sorts of size categories, depending on the individual subscription, with data quality usually increasing in firm size. The same supplier also offers Orbis (a kind of Amadeus version with worldwide coverage) and Osiris (restricted to publicly-traded firms). Amadeus optionally contains European patents matched to firms (since 2017 this requires an additional subscription fee). An alternative could be to obtain patent data from PATSTAT and then match to Amadeus, e.g. with Bruegel's Python-based REMERGE algorithm [Link]
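The point about concordance tables above (technology classes on the patent side, industries on the research side, and no perfect mapping between them) can be illustrated with a minimal sketch. It assumes a many-to-many concordance expressed as fractional weights; the class codes, industry codes, weights, and the function name `allocate_to_industries` are all made up for illustration and do not come from any of the linked tables.

```python
from collections import defaultdict

# Hypothetical concordance: technology class -> {industry code: weight}.
# Weights per class sum to 1. All codes and weights are illustrative only.
CONCORDANCE = {
    "A61K": {"2100": 0.9, "2000": 0.1},
    "H04L": {"2630": 0.6, "6100": 0.4},
}


def allocate_to_industries(patent_counts, concordance):
    """Spread patent counts per technology class across industries.

    Returns the fractional counts per industry and the number of patents
    whose technology class has no entry in the concordance.
    """
    industry_counts = defaultdict(float)
    unmatched = 0.0
    for tech_class, count in patent_counts.items():
        weights = concordance.get(tech_class)
        if weights is None:
            # Keep track of what the concordance cannot place.
            unmatched += count
            continue
        for industry, weight in weights.items():
            industry_counts[industry] += count * weight
    return dict(industry_counts), unmatched


if __name__ == "__main__":
    counts = {"A61K": 10, "H04L": 5, "G06F": 3}
    by_industry, unmatched = allocate_to_industries(counts, CONCORDANCE)
    print(by_industry)
    print("unmatched patents:", unmatched)
```

Fractional allocation is only one possible design; assigning each class to its single most likely industry is simpler but throws away information. Either way, reporting the unmatched share is a cheap check on how imperfect the concordance is for a given sample.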

Software and Languages

  • Of general applicability: keyboard shortcuts for a lot of (popular) software. [Link] More specialist software, like Mathematica and Maple, is (sometimes) covered by its own help files.
  • Bibliography: Zotero [Link]
    • Bibliographical data syncs with the Zotero server for free. Your PDFs (and any other files you want to link to your references) can be synced via a cloud service of your choice if you link rather than attach files in Zotero, which is possible via the ZotFile plugin.
    • Zutilo is another very useful plugin, for example allowing you to copy tags from one reference to another, easily create sections of a book reference, and relate (link) several references to each other.
  • Computer Algebra / Symbolic Computation Systems:
    • Introduction by Zdzislaw Meglicki (Indiana University); a bit dated, but the basic syntax hasn't changed; also includes a nice comparison chart for Maple, Mathematica and Maxima [Link]
    • Maple:
      • Tutorials by Tobin Driscoll (Delaware) [Link]
    • Mathematica:
      • "A moderately paced practical tutorial for Mathematica programming language" by Leonid Shifrin (also available as PDF) [Link]
      • A great introductory text from the times when it was apparently not yet embarrassing to admit you don't know how to use Mathematica: Stinespring, J.R. (2002). Mathematica for Microeconomics. Learning by Example. London, UK: Academic Press. Sadly out of print and never updated, though most of the syntax has survived until today. [Link]
    • Maxima: open-source alternative to Maple and Mathematica (and the MATLAB Symbolic Math Toolbox). Even if you're doing applied econometrics, sometimes you can't avoid taking a derivative in Economics. [Link]
      • With Stinespring (2002) out of print, a similar current alternative exists for Maxima: Hammock, M., & Mixon, W. (2013). Microeconomic Theory and Computation: Applying the Maxima Open-Source Computer Algebra System. New York, NY: Springer. The website accompanying the book offers Maxima workbook files and exercises for every chapter. [Link]
      • A Maxima tutorial from the WU Vienna [Link]
  • Statistics:
    • "Quick Guides" to Statistical Software by Kurt Schmidheiny (U Basel): Stata/Mata, R, Matlab, Maple, EViews [Link].
      • Also provides some comprehensive econometrics lecture notes.
    • Stata:
      • Another link collection by Masayuki Kudamatsu [Link]
      • Tutorials:
        • More or less the standard among Stata tutorials, by the Institute for Digital Research and Education, UCLA. Includes annotated output and coding advice. [Link]
        • Stata Tutorial by Germán Rodríguez (Princeton U) [Link]
        • A collection of some further "every-day advice": [Link]
        • The tutorial datasets used by Cameron & Trivedi (2005, 2009) are available [here].
      • A collection of advice on how to use Stata on "Big Data" sets:
        • Working with very large datasets: Stata Forum [Link], NBER webpage [Link], Stata documentation on Managing Memory [Link], Stata website on the max number of observations fitting in given memory [Link]
        • Stata programs for big data: FTOOLS, Gtools, fastxtile
        • Learn regular expressions for handling text data [Link]
        • Boosting and Clustering Visualisation modules by Matt Schonlau [Link] (who has also contributed to phdtips.com)
      • A couple of Stata programs that I use(d at some point):
        • Creating regression tables, LaTeX and HTML documents [Link]
    • R:
      • tutorials by Simon Ejdemyr [current and archived]
      • Basic but comprehensive introduction to statistics, plus R and SAS (in German). [Link]
      • "Applied Econometrics with R" (Uni of Innsbruck) [Link]
      • Introduction to R for data analysis and time series (WU Vienna, in German), plus an introduction to LaTeX. [Link]
      • "Try R" by Code School (free; other than that Code School seems like a paid-for version of Codecademy). [Link]
  • Programming:
    • Super brief introductions to almost any programming language by Learn X in Y minutes. [Link]
    • Another great resource for learning how to code: Codecademy. Use it to learn SQL to be able to use the PATSTAT database! You need to sign up, but it's free. [Link]
    • If you're going to learn a single programming language these days, go for Python. Some introductions: [Link], [Link], [Link], [Link] and [Link].
  • LaTeX:
    • ShareLaTeX: generally I prefer offline applications over anything online, but co-authoring a LaTeX document can be a pain. This is where ShareLaTeX shines, besides its amazing LaTeX documentation. Projects with two authors are free. [Link]
    • Create your own Beamer template (R-bloggers) [Link] (you will have to use LaTeX in your presentations, since among economists PowerPoint is considered a "sufficient statistic for lack of content" [forgot the reference...])
    • BibTeX (organising your citations in LaTeX):
      • Getting Started with Biblatex (ShareLaTeX) [Link]
      • Zotero with LaTeX and BibTeX (MIT Libraries) [Link]
    • Templates: Typesetting your academic CV in LaTeX (by Dario Taraborelli, Wikimedia) [Link]
  • PDF:
    • Sejda offers free online versions of PDF manipulation functionality that is usually not available in free versions [Link]
  • Text processing:
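One text-processing task that recurs in the resources above is harmonising firm names with regular expressions before matching patent assignees to firm databases. The following is only a minimal sketch of that idea, not the method of any of the linked tools: the function name `normalise_name` and the short list of legal-form suffixes are assumptions made for illustration, and a real project would need a much longer, country-specific list.

```python
import re

# Hypothetical, deliberately short list of legal forms to strip.
LEGAL_FORMS = re.compile(
    r"\b(inc|corp|corporation|co|ltd|limited|gmbh|ag|sa|spa|llc|plc)\b"
)


def normalise_name(name):
    """Return a crude matching key for a firm or assignee name."""
    key = name.lower()
    key = key.replace(".", "")             # "A.G." -> "ag"
    key = re.sub(r"[^\w\s]", " ", key)     # other punctuation -> space
    key = LEGAL_FORMS.sub(" ", key)        # drop legal-form suffixes
    return re.sub(r"\s+", " ", key).strip()  # collapse whitespace


if __name__ == "__main__":
    for raw in ["Siemens A.G.", "SIEMENS AG", "Procter & Gamble Co."]:
        print(raw, "->", normalise_name(raw))
```

Exact matching on such keys is the simplest approach; it will miss spelling variants and can merge distinct firms that share a name, so matched pairs should be checked by hand at least on a sample.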

Learning resources

  • General introductions:
    • Handbooks: ScienceDirect (by Elsevier) offers reference books for most branches of Economics, downloadable from most unis [Link]. Also reaching beyond economics, for example with the great International Encyclopedia of the Social & Behavioral Sciences [Link]
    • Super simple introductory videos for almost any topic of mainstream economics, by the Marginal Revolution blog.
    • Collection of links to lecture notes from first- and second-year graduate courses [Link]
  • Mathematics:
    • Proofs in Mathematics [Link]
    • Collection of solution manuals created by John L. Weatherwax, a man gifted with extraordinary productivity [Link]
  • (Applied) Econometrics:
    • Freely available Econometrics texts:
      • Online Statistics Education (Rice / Houston / Tufts) [Link] (also offers a PDF download)
      • As good as a textbook: lecture notes on Applied Econometrics by the amazing Christopher Baum (Boston College) [Link]
    • Some issues in econometrics (in particular of relevance when working with patent data):
      • Discussion of causality in econometrics textbooks (Chris Auld, University of Victoria) [Link]
    • Collection of textbook recommendations for econometrics, financial economics, and game theory. [Link]
  • Microeconomics:
    • Harald Wiese offers textbook-style lecture notes on various topics in Microeconomics [Link & Link] (have a look at the respective lecture's subpage)
    • Lecture Notes on "Wettbewerbstheorie und -politik" (competition theory and policy) by Florian Englmaier [Link]
    • Various handouts on topics in basic-intermediate micro and macro by Nicholas J. Sanders [Link]
    • Reading list on Micro Theory and Game Theory [Link].
  • Game Theory:
    • Freely available textbook:
      • Game Theory: Parts I and II with 88 solved exercises. An open access textbook, by Giacomo Bonanno (UC Davis) [Link]
    • Collection of Game Theory Lecture Notes from BA to PhD level, compiled by Mike Shor (Vanderbilt). Sadly celebrating their 10th anniversary of not being maintained. [Link]
  • Industrial Organisation:
    • Two freely available textbooks:
      • At the undergrad level, with a focus on antitrust and regulation:
        • IO: A Strategic Approach, by Jeffrey Church (U Calgary) and Roger Ware (Queen's U, Kingston) (2000) [Link]
      • At the graduate level, assuming you know some Maths:
        • IO, a Contract Based Approach, by Nicolas Boccard (Girona) (2013) [Link]
  • Networks:
    • Two freely available (text)books in Network Theory, both at undergrad level (both books are used in the course by Ozalp Babaoglu at the Department of Computer Science at Bologna):
      • with a focus a bit more on the Maths: Graph Theory and Complex Networks: An Introduction, by Maarten van Steen (Twente) (2010) [Link]
      • with some economics context: Networks, Crowds, and Markets: Reasoning about a Highly Connected World, by David Easley and Jon Kleinberg (both Cornell) (2010) [Link]
  • Research Methods: Short talks providing introductions to Research Methods in the Social Sciences from Manchester [Link]
  • Further sources of information worth knowing about. [Link]