{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# `enterprise` Data Structures\n",
    "\n",
    "This guide will give an introduction to the unique data structures used in `enterprise`. These are all designed with the goal of making this code as user-friendly as possible, both for the end user and the developer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warning: cannot find astropy, units support will not be available.\n"
     ]
    }
   ],
   "source": [
    "% matplotlib inline\n",
    "%config InlineBackend.figure_format = 'retina'\n",
    "\n",
    "from __future__ import division\n",
    "\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "from enterprise.pulsar import Pulsar\n",
    "import enterprise.signals.parameter as parameter\n",
    "from enterprise.signals import utils\n",
    "from enterprise.signals import signal_base\n",
    "from enterprise.signals import selections\n",
    "from enterprise.signals.selections import Selection\n",
    "from tests.enterprise_test_data import datadir"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Class Factories\n",
    "\n",
    "\n",
    "The `enterprise` code makes heavy use of so-called class factories. Class factories are functions that return classes (not objects of class instances). A simple example is as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def A(farg1, farg2):\n",
    "    class A(object):\n",
    "        \n",
    "        def __init__(self, iarg):\n",
    "            self.iarg = iarg\n",
    "        \n",
    "        def print_info(self):\n",
    "            print('Object instance {}\\nInstance argument: {}\\nFunction args: {} {}\\n'.format(\n",
    "                self, self.iarg, farg1, farg2))\n",
    "    return A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Object instance <__main__.A object at 0x10e356810>\n",
      "Instance argument: iarg1\n",
      "Function args: arg1 arg2\n",
      "\n",
      "Object instance <__main__.A object at 0x1116f4f10>\n",
      "Instance argument: iarg2\n",
      "Function args: arg1 arg2\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# define class A with arguments that can be seen within the class\n",
    "a = A('arg1', 'arg2')\n",
    "\n",
    "# instantiate 2 instances of class A with different arguments\n",
    "a1 = a('iarg1')\n",
    "a2 = a('iarg2')\n",
    "\n",
    "# call print_info method\n",
    "a1.print_info()\n",
    "a2.print_info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the example above we see that the arguments ``arg1`` and ``arg2`` are seen by both instances ``a1`` and ``a2``; however these instances were intantiated with different input arguments ``iarg1`` and ``iarg2``. So we see that class-factories are great when we want to give \"global\" parameters to a class without having to pass them on initialization. This also allows us to mix and match classes, as we will do in `enterprise` before we instantiate them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The `Pulsar` class\n",
    "\n",
    "The `Pulsar` class is a simple data structure that stores all of the important information about a pulsar that is obtained from a timing package such as the TOAs, residuals, error-bars, flags, design matrix, etc.\n",
    "\n",
    "This class is instantiated with a par and a tim file. Full documentation on this class can be found [here.](enterprise.html#module-enterprise.pulsar)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "psr = Pulsar(datadir+'/B1855+09_NANOGrav_9yv1.gls.par', datadir+'/B1855+09_NANOGrav_9yv1.tim')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This `Pulsar` object is then passed to other `enterprise` data structures in a loosley coupled way in order to interact with the pulsar data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The `Parameter` class\n",
    "\n",
    "In `enterprise` signal parameters are set by specifying a prior distribution (i.e., Uniform, Normal, etc.). These `Parameter`s are how `enterprise` builds signals. Below we will give an example of this functionality."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'enterprise.signals.parameter.Uniform'>\n"
     ]
    }
   ],
   "source": [
    "# lets define an efac parameter with a uniform prior from [0.5, 5]\n",
    "efac = parameter.Uniform(0.5, 5)\n",
    "print(efac)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`Uniform` is a class factory that returns a class. The parameter is then intialized via a name. This way, a single parameter class can be initialized for multiple signal parameters with different names (i.e. EFAC per observing backend, etc). Once the parameter is initialized then you then have access to many useful methods."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\"efac_1\":Uniform(0.5,5)\n",
      "efac_1\n",
      "(0.22222222222222221, -1.5040773967762742)\n",
      "[ 2.82288791  3.47338006  1.68693806  4.34250608  4.79228485]\n"
     ]
    }
   ],
   "source": [
    "# initialize efac parameter with name \"efac_1\"\n",
    "efac1 = efac('efac_1')\n",
    "print(efac1)\n",
    "\n",
    "# return parameter name\n",
    "print(efac1.name)\n",
    "\n",
    "# get pdf at a point (log pdf is access)\n",
    "print(efac1.get_pdf(1.3), efac1.get_logpdf(1.3))\n",
    "\n",
    "# return 5 samples from this prior distribution\n",
    "print(efac1.sample(size=5))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The `Function` structure\n",
    "\n",
    "In `enterprise` we have defined a special data structure called `Function`. This data structure provides the user with a way to use and combine several different `enterprise` components in a user friendly way. More explicitly, it converts and standard function into an `enterprise` `Function` which can extract information from the `Pulsar` object and can also interact with `enterprise` `Parameter`s.\n",
    "\n",
    "[**put reference to docstring here**]\n",
    "\n",
    "For example, consider the function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@signal_base.function\n",
    "def sine_wave(toas, log10_A=-7, log10_f=-8):\n",
    "    return 10**log10_A * np.sin(2*np.pi*toas*10**log10_f)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice that the first positional argument of the function is `toas`, which happens to be a name of an attribute in the `Pulsar` class and the keyword arguments specify the default parameters for this function.\n",
    "\n",
    "The decorator converts this standard function to a `Function` which can be used in two ways: the first way is to treat it like any other function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[  1.98691765e-15   3.97383531e-15   5.96075296e-15]\n"
     ]
    }
   ],
   "source": [
    "# treat it just as a standard function with a vector input\n",
    "sw = sine_wave(np.array([1,2,3]), log10_A=-8, log10_f=-7.5)\n",
    "print(sw)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "the second way is to use it as a `Function`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'enterprise.signals.signal_base.Function'>\n"
     ]
    }
   ],
   "source": [
    "# or use it as an enterprise function\n",
    "sw_function = sine_wave(log10_A=parameter.Uniform(-10,-5), log10_f=parameter.Uniform(-9, -7))\n",
    "print(sw_function)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we see that `Function` is actually a class factory, that is, when initialized with `enterprise` `Parameter`s it returns a class that is initialized with a name and a `Pulsar` object as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<enterprise.signals.signal_base.Function object at 0x10e3567d0>\n"
     ]
    }
   ],
   "source": [
    "sw2 = sw_function('sine_wave', psr=psr)\n",
    "print(sw2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now this `Function` object carries around instances of the `Parameter` classes given above for this particular function and `Pulsar`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[\"sine_wave_log10_A\":Uniform(-10,-5), \"sine_wave_log10_f\":Uniform(-9,-7)]\n"
     ]
    }
   ],
   "source": [
    "print(sw2.params)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Most importantly it can be called in three different ways:\n",
    "If given without parameters it will fall back on the defaults given in the original function definition"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[  5.97588901e-08   5.97588901e-08   5.97588901e-08 ...,  -5.80521219e-08\n",
      "  -5.80521219e-08  -5.80521219e-08]\n"
     ]
    }
   ],
   "source": [
    "print(sw2())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "or we can give it new fixed parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ -7.23515356e-09  -7.23515356e-09  -7.23515356e-09 ...,   5.93768399e-09\n",
      "   5.93768399e-09   5.93768399e-09]\n"
     ]
    }
   ],
   "source": [
    "print(sw2(log10_A=-8, log10_f=-6.5))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "or most importantly we can give it a parameter dictionary with the `Parameter` names as keys. This is how `Function`s are use internally inside `enterprise`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ -7.23515356e-09  -7.23515356e-09  -7.23515356e-09 ...,   5.93768399e-09\n",
      "   5.93768399e-09   5.93768399e-09]\n"
     ]
    }
   ],
   "source": [
    "params = {'sine_wave_log10_A':-8, 'sine_wave_log10_f':-6.5}\n",
    "print(sw2(params=params))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice that the last two methods give the same answer since we gave it the same values just in different ways. So you may be thinking: \"Why did we pass the `Pulsar` object on initialization?\" or \"Wait. How does it know about the toas?!\". Well the first question answers the second. By passing the pulsar object it grabs the `toas` attribute internally. This feature, combined with the ability to recognize `Parameter`s and the ability to call the original function as we always would are the main strengths of `Function`, which is used heavily in `enterprise`.\n",
    "\n",
    "Note that if we define a function without the decorator then we can still obtain a `Function` via:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'enterprise.signals.signal_base.Function'>\n"
     ]
    }
   ],
   "source": [
    "def sine_wave(toas, log10_A=-7, log10_f=-8):\n",
    "    return 10**log10_A * np.sin(2*np.pi*toas*10**log10_f)\n",
    "\n",
    "sw3 = signal_base.Function(sine_wave, log10_A=parameter.Uniform(-10,-5), \n",
    "                           log10_f=parameter.Uniform(-9, -7))\n",
    "\n",
    "print(sw3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Make your own `Function`\n",
    "\n",
    "To define your own `Function` all you have to do is to define a function with these rules in mind.\n",
    "\n",
    "1. If you want to use `Pulsar` attributes, define them as positional arguments with the same name as used in the `Pulsar` class (see [here](enterprise.html#module-enterprise.pulsar) for more information.\n",
    "2. Any arguments that you may use as `Parameter`s must be keyword arguments (although you can have others that aren't `Parameter`s)\n",
    "3. Add the `@function` decorator.\n",
    "\n",
    "And thats it! You can now define your own `Function`s with minimal overhead and use them in `enterprise` or for tests and simulations or whatever you want!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The `Selection` structure\n",
    "\n",
    "In the course of our analysis it is useful to split different signals into pieces. The most common flavor of this is to split the white noise parameters (i.e., EFAC, EQUAD, and ECORR) by observing backend system. The `Selection` structure is here to make this as smooth and versatile as possible.\n",
    "\n",
    "The `Selection` structure is also a class-factory that returns a specific selection dictionary with keys and Boolean arrays as values.\n",
    "\n",
    "This will become more clear with an example. Lets say that you want to split our parameters between the first and second half of the dataset, then we can define the following function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def cut_half(toas):\n",
    "    midpoint = (toas.max() + toas.min()) / 2\n",
    "    return dict(zip(['t1', 't2'], [toas <= midpoint, toas > midpoint]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This function will return a dictionary with keys (i.e. the names of the different subsections) `t1` and `t2` and boolean arrays corresponding to the first and second halves of the data span, respectively. So for a simple input we have:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'t2': array([False, False,  True,  True], dtype=bool), 't1': array([ True,  True, False, False], dtype=bool)}\n"
     ]
    }
   ],
   "source": [
    "toas = np.array([1,2,3,4])\n",
    "print(cut_half(toas))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To pass this to `enterprise` we turn it into a `Selection` via:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'enterprise.signals.selections.Selection'>\n"
     ]
    }
   ],
   "source": [
    "ch = Selection(cut_half)\n",
    "print(ch)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we have stated, this is class factory that will be initialized inside `enterprise` signals with a `Pulsar` object in a very similar way to `Function`s."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<enterprise.signals.selections.Selection object at 0x111be3450>\n",
      "{'t2': array([False, False, False, ...,  True,  True,  True], dtype=bool), 't1': array([ True,  True,  True, ..., False, False, False], dtype=bool)}\n"
     ]
    }
   ],
   "source": [
    "ch1 = ch(psr)\n",
    "print(ch1)\n",
    "print(ch1.masks)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `Selection` object has a method `masks` that uses the `Pulsar` object to evaluate the arguments of `cut_half` (these can be any number of `Pulsar` attributes, not just `toas`). The `Selection` object can also be called to return initialized `Parameter`s with the split names as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{u't1_efac': \"B1855+09_t1_efac\":Uniform(0.1,5.0), u't2_efac': \"B1855+09_t2_efac\":Uniform(0.1,5.0)}\n",
      "{u't1_efac': array([ True,  True,  True, ..., False, False, False], dtype=bool), u't2_efac': array([False, False, False, ...,  True,  True,  True], dtype=bool)}\n"
     ]
    }
   ],
   "source": [
    "# make efac class factory\n",
    "efac = parameter.Uniform(0.1, 5.0)\n",
    "\n",
    "# now give it to selection\n",
    "params, masks = ch1('efac', efac)\n",
    "\n",
    "# named parameters\n",
    "print(params)\n",
    "\n",
    "# named masks\n",
    "print(masks)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Make your  own `Selection`\n",
    "\n",
    "To define your own `Selection` all you have to do is to define a function with these rules in mind.\n",
    "\n",
    "1. If you want to use `Pulsar` attributes, define them as positional arguments with the same name as used in the `Pulsar` class (see [here](enterprise.html#module-enterprise.pulsar) for more information.\n",
    "2. Make sure the return value is a dictionary with the names you want for the different segments and values as boolean arrays specifying which points to apply the split to. \n",
    "3. A selection does not have to apply to all points. You can make it apply to only single points or single segments if you wish.\n",
    "\n",
    "And thats it! You can now define your own `Selection`s with minimal overhead and use them in `enterprise` or for tests and simulations or whatever you want!\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## `Signal`s, `SignalCollection`s, and `PTA`s oh my!\n",
    "\n",
    "**Coming Soon!**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
