Friday, November 11, 2011

Reporting Requirements: Swap Data Repository - Dodd-Frank Regulation

BI Reporting Requirements for Over-the-Counter (OTC) Swaps
under Dodd-Frank Regulators - Part 1
Compiled by Suvradeep Rudra

New reporting and recordkeeping rules generally distinguish between two categories of information:
• Swap creation data (such as the primary economic terms of the swap and confirmation data).
• Swap continuation data (such as event data, valuation information and term changes).

• Swap execution facilities (“SEFs”)
• Designated contract markets (“DCMs”)

1. TOP SWAPS
   a. Top 10 records (swaps) by $ value for the day
   b. Top 10 records (swaps) by $ value for the month
   c. Top 10 records (swaps) by $ value for the quarter

Reports should include the following columns:
   i.   Unique Counterparty Identifier (UCI)
   ii.  Unique Swap Identifier (USI)
   iii. Unique Product Identifier (UPI)
   iv.  Start date
   v.   Expiration date
   vi.  $ amount

• The UCI would identify the legal entity that is a counterparty to a swap. Under the proposed rules, the Commission would require use of UCIs in all swap data reporting.
• The Unique Swap Identifier (USI) called for by the proposed rules would be created and assigned to a swap at the time it is executed, and used to identify that particular swap transaction throughout its existence.
• The Unique Product Identifier (UPI) called for by the proposed rules would categorize swaps according to the underlying products referenced in them. While the UPI would be assigned at a particular level of the taxonomy of the asset class or sub-asset class in question, its existence would enable the Commission and other regulators to aggregate transactions at various taxonomy levels based on the type of product underlying the swap.
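A top-10 report like the one outlined above can be sketched in a few lines. This is a minimal illustration only: the record fields (uci, usi, upi, notional_usd, and the sample values) are hypothetical stand-ins, not the CFTC's official schema.

```python
from datetime import date

# Hypothetical swap records; field names and values are illustrative.
swaps = [
    {"uci": "CP-001", "usi": "USI-1001", "upi": "IR-SWAP-USD",
     "start_date": date(2011, 11, 1), "expiration_date": date(2016, 11, 1),
     "notional_usd": 25_000_000},
    {"uci": "CP-002", "usi": "USI-1002", "upi": "CDS-IG-5Y",
     "start_date": date(2011, 11, 3), "expiration_date": date(2016, 11, 3),
     "notional_usd": 75_000_000},
    {"uci": "CP-003", "usi": "USI-1003", "upi": "FX-SWAP-EURUSD",
     "start_date": date(2011, 11, 4), "expiration_date": date(2012, 11, 4),
     "notional_usd": 40_000_000},
]

def top_swaps(records, n=10):
    """Return the top-n swaps by $ value, with the report columns above."""
    ranked = sorted(records, key=lambda r: r["notional_usd"], reverse=True)
    return [(r["uci"], r["usi"], r["upi"], r["start_date"],
             r["expiration_date"], r["notional_usd"]) for r in ranked[:n]]

for row in top_swaps(swaps):
    print(row)
```

The same function serves the daily, monthly, and quarterly variants if the input records are first filtered to the relevant period.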

2. Reporting of Swap Creation Data – Executed on a Platform

 The Dodd-Frank Act lays the foundation, defining a SEF to be "a facility, trading system or platform in which multiple participants have the ability to execute or trade swaps by accepting bids and offers made by other participants that are open to multiple participants in the facility or system, through any means of interstate commerce."

The expected role of a SEF is to provide pre- and post-trade transparency, encourage competitive execution across the entire institutional marketplace, and provide the tools required to ensure a complete record and audit trail of trades. There could be a significant shift in the way derivatives trading is ultimately executed, and Tradeweb has made great strides to stay ahead of the curve for its clients.

Legislative bodies in the U.S. and Europe are moving to increase regulation of the over-the-counter (OTC) derivatives market. These global financial reform initiatives seek to achieve three key objectives for the OTC derivatives markets:
• Increase transparency
• Improve market efficiency
• Prevent market abuse
How Derivatives are Currently Traded
Over-the-counter, or "privately negotiated", derivatives are currently traded on the telephone and increasingly on electronic markets, such as Tradeweb. There are two sectors of the market: institutional dealer-to-client (D2C) and inter-dealer (D2D). These markets are approximately the same size in terms of trading volumes, but there are many more participants in the D2C marketplace than D2D.

Reporting Counterparty
• Swap Dealers (SD) and Major Swap Participants (MSP)
• Non-SD/MSP Counterparties

• Report 1 - Executed on a platform and cleared
• Report 2 - Executed on a platform and not cleared
• Report 3 - Not executed on a platform and cleared
• Report 4 - Not executed on a platform and not cleared
• Report 5 - Credit and Equity Asset Classes – Cleared
• Report 6 - Credit and Equity Asset Classes – Not Cleared
• Report 7 - Interest Rate, Currency, and Other Commodity Asset Classes – Cleared
• Report 8 - Interest Rate, Currency, and Other Commodity Asset Classes – Not Cleared
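The eight report categories above follow a simple grid: execution venue × clearing status for reports 1-4, and asset-class grouping × clearing status for reports 5-8. A small routing sketch (the mapping logic is illustrative, derived only from the list above):

```python
def report_numbers(asset_class, on_platform, cleared):
    """Map a swap to its report numbers from the list above."""
    reports = []
    # Reports 1-4: execution venue x clearing status
    if on_platform and cleared:
        reports.append(1)
    elif on_platform and not cleared:
        reports.append(2)
    elif not on_platform and cleared:
        reports.append(3)
    else:
        reports.append(4)
    # Reports 5-8: asset-class grouping x clearing status
    if asset_class in {"credit", "equity"}:
        reports.append(5 if cleared else 6)
    else:  # interest rate, currency, other commodity
        reports.append(7 if cleared else 8)
    return reports

print(report_numbers("credit", on_platform=True, cleared=True))          # [1, 5]
print(report_numbers("interest rate", on_platform=False, cleared=False)) # [4, 8]
```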

Wednesday, November 9, 2011

Where Hadoop Fits in

By Suvradeep Rudra

Why Hadoop?
• Overcomes traditional limitations of storage and compute.
• Leverages commodity hardware as an inexpensive platform.
• Offers easy linear scalability.
• Open source software.

Big Data Values – For Innovation and Productivity

• Can unlock significant value by making information transparent and usable at much higher frequency.
• Companies are using data collection and analysis to conduct controlled experiments and make better management decisions.
• As organizations create and store more transactional data in digital form, they can collect more accurate and detailed performance information on everything from product inventories to sick days, and therefore expose variability and boost performance.
• Big data allows ever-narrower segmentation of customers and therefore much more precisely tailored products or services.
• Big data can be used to improve the development of the next generation of products and services (e.g., preventive measures that take place before a failure occurs or is even noticed).

Big Data - Challenges
• Policies related to privacy, security, intellectual property, and even liability will need to be addressed in a big data world.
• Organizations need not only the right talent and technology but also workflows and incentives structured to optimize the use of big data.
• Access to data is critical: the challenge will be to integrate information from multiple data sources, often from third parties, and to align the incentives involved.
• There is a shortage of the talent organizations need to take advantage of big data.

About OTC Derivatives

What is a Derivative?
The financial instruments we've considered so far - stocks, bonds, commodities and currencies - are generally referred to as cash instruments (or sometimes, primary instruments). The value of cash instruments is determined directly by markets. By contrast, a derivative derives its value from the value of some other financial asset or variable. For example, a stock option is a derivative that derives its value from the value of a stock. An interest rate swap is a derivative because it derives its value from an interest rate index. The asset from which a derivative derives its value is referred to as the underlying asset. The price of a derivative rises and falls in accordance with the value of the underlying asset. Derivatives are designed to offer a return that mirrors the payoff offered by the instruments on which they are based. 

What Does Derivative Mean?
A security whose price is dependent upon or derived from one or more underlying assets. The derivative itself is merely a contract between two or more parties. Its value is determined by fluctuations in the underlying asset. The most common underlying assets include stocks, bonds, commodities, currencies, interest rates and market indexes. Most derivatives are characterized by high leverage.

Investopedia explains Derivative
Futures contracts, forward contracts, options and swaps are the most common types of derivatives. Because derivatives are contracts, almost anything can be used as an underlying asset. There are even derivatives based on weather data, such as the amount of rain or the number of sunny days in a particular region.

Derivatives are generally used as an instrument to hedge risk, but can also be used for speculative purposes. For example, a European investor purchasing shares of an American company off of an American exchange (using U.S. dollars to do so) would be exposed to exchange-rate risk while holding that stock. To hedge this risk, the investor could purchase currency futures to lock in a specified exchange rate for the future stock sale and currency conversion back into euros.

By way of example, a few standard derivatives are listed below. The most commonly-traded derivatives - forwards, futures, options and swaps - will be described in greater detail. 

A credit derivative is an OTC derivative designed to transfer credit risk from one party to another. By synthetically creating or eliminating credit exposures, credit derivatives allow institutions to manage credit risks more effectively. They take many forms. Three basic structures include: credit default, total return and credit-linked swaps.

Over-The-Counter - OTC
What Does Over-The-Counter - OTC Mean?
A security traded in some context other than on a formal exchange such as the NYSE, TSX, AMEX, etc. The phrase "over-the-counter" can be used to refer to stocks that trade via a dealer network as opposed to on a centralized exchange. It also refers to debt securities and other financial instruments, such as derivatives, which are traded through a dealer network.

Investopedia explains Over-The-Counter - OTC
In general, the reason a stock is traded over-the-counter is usually that the company is small, making it unable to meet exchange listing requirements. Also known as "unlisted stock", these securities are traded by broker-dealers who negotiate directly with one another over computer networks and by phone.

Although Nasdaq operates as a dealer network, Nasdaq stocks are generally not classified as OTC because the Nasdaq is considered a stock exchange. As such, OTC stocks are generally unlisted stocks which trade on the Over the Counter Bulletin Board (OTCBB) or on the pink sheets. Be very wary of some OTC stocks, however; the OTCBB stocks are either penny stocks or are offered by companies with bad credit records.

Instruments such as bonds do not trade on a formal exchange and are, therefore, also considered OTC securities. Most debt instruments are traded by investment banks making markets for specific issues. If an investor wants to buy or sell a bond, he or she must call the bank that makes the market in that bond and ask for quotes.


Thursday, June 30, 2011

Customer Analytics

Customer analytics is a buzzword now, but how can it help companies market more effectively? In today's world, tech-savvy customers use spam filters to avoid marketing emails they don't want.

Therefore, in this new environment, the marketing process should be based on an analytical framework that supports targeted marketing instead of mass marketing. It really makes a difference.

Here are some important components of that framework:

Analytically driven Customer Segmentation
This process involves dividing the customer base into groups that are similar in a specific way and can be targeted as a whole. It enables marketers to target each group efficiently and with the right resources. The overall benefit is a more profitable campaign, because you know your target better.

Grouping is done through a process that identifies specific needs and preferences, so these groups can be hand-picked for specific campaigns and programs. This yields a more positive response than a mass campaign.
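One common way to form such groups analytically is clustering, for example k-means. The sketch below is a minimal, self-contained illustration; the customer data and the two features (annual spend, orders per year) are hypothetical.

```python
import random

# Hypothetical customers as (annual spend $, orders per year).
customers = [(120, 2), (150, 3), (900, 25), (950, 30), (480, 10), (520, 12)]

def kmeans(points, k=3, iters=20, seed=42):
    """Group points into k segments by repeatedly assigning each point
    to its nearest center and recomputing the centers."""
    random.seed(seed)
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(p, centers[c])))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its cluster (keep the old
        # center if the cluster happens to be empty).
        centers = [tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters

for i, segment in enumerate(kmeans(customers), 1):
    print(f"Segment {i}: {segment}")
```

In practice many more behavioral features would be used, and each resulting segment would be profiled before assigning campaigns to it.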

Predictive Modeling
This component predicts future customer behavior based on past activities. It provides insight into the behavior patterns of a company's best and worst customers, and predicts which customers are likely to leave.
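A churn model of this kind often boils down to scoring each customer from a few behavioral features. The sketch below stands in for a trained model (such as logistic regression); the features and weights are entirely made up for illustration.

```python
import math

def churn_probability(days_since_last_purchase, complaints, tenure_years):
    """Toy churn score: a logistic function of hypothetical features.
    The weights are illustrative; a real model learns them from
    historical customer behavior."""
    z = (0.03 * days_since_last_purchase
         + 0.8 * complaints
         - 0.4 * tenure_years
         - 2.0)
    return 1 / (1 + math.exp(-z))

# Long inactivity plus complaints scores high; an active, loyal
# customer scores low.
print(round(churn_probability(90, 2, 1), 3))
print(round(churn_probability(5, 0, 6), 3))
```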

Marketing Optimization
Based on the above two components, we can handle marketing challenges (budget constraints, channel constraints, customer contact policies) more efficiently.

In a nutshell:

  • Increase response rates, customer loyalty and ROI.
  • Reduce campaign costs.
  • Decrease customer attrition.
  • Deliver the right message to the customer.

Wednesday, June 29, 2011

Top Benefits of Marketing Analytics

Analytically driven Customer Segmentation

Predictive Modeling

Marketing Optimization

Friday, May 20, 2011

The UC San Diego Extension staff identifies the following 14 niche careers as sectors to watch

Data Mining

Data mining is an exploding industry, largely due to the massive amount of data generated by the population's use of technology and the Web, which can be used to predict trends and consumer behavior. A study out of UC Berkeley shows that the amount of data in the world doubles every three years. Career prospects include advertising technology, fraud detection, risk management and law enforcement. Data mining requires understanding of algorithms and advanced statistics as well as programming and computer management.

Saturday, May 7, 2011

Multiple Linear Regression

Multiple Linear Regression: a technique for analyzing certain types of multivariate data. It helps us understand the relationship between a response variable and one or more predictor variables, and lets us estimate the value of the response variable from known values of the predictor variables.

Y, the response variable: also known as the dependent variable, the outcome, or simply an output variable. This variable should be quantitative, having meaningful numerical values.
X, the predictor variables: X1, X2, ..., Xn. Also known as input variables or covariates. These variables should also be quantitative.

The multiple linear regression model can be represented mathematically as an algebraic relationship between response variable and one or more predictive variable.

Investopedia explains Multiple Linear Regression - MLR
MLR takes a group of random variables and tries to find a mathematical relationship between them. The model creates a relationship in the form of a straight line (linear) that best approximates all the individual data points. 

MLR is often used to determine how specific factors, such as the price of a commodity, interest rates, and particular industries or sectors, influence the price movement of an asset. For example, the current price of oil, lending rates, and the price movement of oil futures can all have an effect on an oil company's stock price. MLR could be used to model the impact that each of these variables has on the stock's price.

Some Examples:

  1. A data set consisting of the gender, height and age of children between 5 and 10 years old. You could use multiple linear regression to predict the height of a child (dependent variable) using both age and gender as predictors (i.e., two independent variables).
  2. The current price of oil, lending rates, and the price movement of oil futures can all have an effect on an oil company's stock price.
  3. An excellent example is a study conducted by an American university, quantifying the relationship between a student's final exam score and the number of hours spent studying and partying during the last week of the term. Here Y is the exam score, and the predictor variables are X1 = hours spent studying and X2 = hours spent partying.

The goal of multiple linear regression (MLR) is to model the relationship between the explanatory and response variables.

The model for MLR, given n observations, is:

y_i = β0 + β1*x_i1 + β2*x_i2 + ... + βp*x_ip + ε_i, where i = 1, 2, ..., n

Multiple Regression Model

Let's take a real-world example of predicting the sale price of homes (sale price in $ thousands) based on two predictor variables:

  1. Floor size (in thousands of sq feet)
  2. Lot size (a category: a home built on a large amount of land will have a much higher price than a home with less land, all else being constant; therefore we can categorize 0-3k sq feet as category 1, 3-5k as category 2, and so on up to category 10)
After calculation we get the best-fit model Y = 122.36 + 61.9*X1 + 7.09*X2

or   Price of home ($ thousands) = 122.36 + 61.9 * floor size + 7.09 * lot-size category

Having said that, we conclude that the sale price will increase by about $6,190 for each 100 sq foot increase in floor size, holding lot size constant.
Similarly, the sale price will increase by about $7,090 for each one-category increase in lot size, holding floor size constant.

The above multiple regression analysis identifies whether a change in one variable is associated with a change in another variable; it does NOT establish that changing one variable will cause the other to change.
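The fitted model quoted above can be turned into a small prediction helper. Note the response is in $ thousands; the 2,000 sq ft example input is made up for illustration.

```python
def predict_price(floor_size_k_sqft, lot_category):
    """Predicted sale price in $ thousands, using the coefficients
    from the fitted model: Y = 122.36 + 61.9*X1 + 7.09*X2."""
    return 122.36 + 61.9 * floor_size_k_sqft + 7.09 * lot_category

# A 2,000 sq ft home (X1 = 2.0) on a lot in category 3:
print(predict_price(2.0, 3))   # price in $ thousands

# Marginal effects implied by the coefficients, converted to dollars:
per_100_sqft = 61.9 * 0.1 * 1000   # dollars per extra 100 sq ft
per_category = 7.09 * 1000         # dollars per lot-size category step
print(per_100_sqft, per_category)
```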

How to evaluate the MODEL
The basic question is how to evaluate whether the model is a good fit. We generally use three standard methods to numerically evaluate how well a regression model fits sample data:

  1. The regression standard error.
  2. The coefficient of determination (R2).
  3. The slope parameters.
The regression standard error:

Coming back to our last example (Price of home = 122.36 + 61.9 * floor size + 7.09 * lot-size category), we found the "root mean square error" (the regression standard error, s) to be 2.4752, using SAS as the statistical software to fit the regression.
By the rough 95% rule of thumb, predictions are accurate to within about 2s, or 2 * 2.4752 = 4.95.

In other words, at roughly 95% confidence, the model predicts home prices to within about +/- $4,950.
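The arithmetic behind the ±$4,950 figure can be checked in a couple of lines, using the RMSE value reported in the text:

```python
# The +/- 2s rule of thumb: roughly 95% of actual prices fall within
# two regression standard errors of the model's prediction.
s = 2.4752           # root mean square error from the text ($ thousands)
half_width = 2 * s   # ~95% prediction half-width, in $ thousands
print(round(half_width * 1000))  # in dollars
```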

Coefficient of determination (R2):
In layman's terms, R2 represents the percentage of the variation in the response variable that is explained by the regression relationship with the predictor variables.
R2 lies between 0 and 1, i.e., 0% to 100%. In our example the value of R2 is 0.9717, which translates to 97.17% of the variation in home sale prices being explained by the linear regression on floor size and lot size.
The greater the value, the better the fit.

Adjusted R2 (R square):
Unfortunately, R2 is not a reliable guide to model building, because when we add a predictor to a model, R2 either increases or stays the same.
A better way is to use adjusted R2, which rewards good fit without overfitting. It can be used to guide model building, since it decreases when extra, unimportant predictors are added to the model.
Coming back to our example, the adjusted R2 is 0.9528, which means 95.28% of the variation in home sale prices is explained by the regression on floor size and lot size, a more honest figure than the 97.17% suggested by R2.
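Both statistics follow directly from their formulas. The sketch below shows the computation; the sample size n = 15 used in the example call is a made-up number (the text does not state how many sales were in the data set), so the adjusted value it prints will not match the 0.9528 reported above.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SSE/SST."""
    y_bar = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # residual sum of squares
    sst = sum((a - y_bar) ** 2 for a in y)             # total sum of squares
    return 1 - sse / sst

def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Toy numbers: suppose R^2 = 0.9717 came from n = 15 sales (hypothetical)
# and p = 2 predictors (floor size, lot size).
print(round(adjusted_r_squared(0.9717, 15, 2), 4))
```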

Thursday, May 5, 2011

Data Warehouse 2.0

As per Inmon, the father of data warehousing, the data warehouse is a basis for informational processing. It is defined as being:
■ subject oriented;
■ integrated;
■ nonvolatile;
■ time variant;
■ a collection of data in support of management's decisions.

Problems with traditional data warehouse models:
1. Active Data Warehouse - difficulty in maintaining transaction integrity, capacity planning, processing conflicts, cost.
2. Federated Data Warehouse - very poor performance, no data integrity, no history of data, improper grains.
3. Star Schema - resistant to change, limited scope for optimization, useful only at the lowest grain.
4. Data Mart - problems with data reconciliation, maintenance issues, rigid design that is not flexible to change, difficulty implementing future changes.

Building the REAL Data Warehouse

The data warehouse is divided into 4 sectors:
  • Very Current (aka Interactive Sector) - data as fresh as 2 seconds old.
  • Current (aka Integrated Sector) - data up to about 24 hours old.
  • Near Line (aka Near Line Sector) - data roughly 3-4 years old.
  • Archival (aka Archival Sector) - data older than 5 years.
The infrastructure is held together by metadata.
Data access is quick because it is organized by sector.
Data archiving is done automatically.
There is a smaller volume of data in each sector.
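The sector assignment above is essentially a routing decision based on record age. A minimal sketch, with thresholds that are illustrative approximations of the rough ages listed (the DW 2.0 description leaves the exact boundaries to the implementation):

```python
from datetime import datetime, timedelta

def sector_for(record_time, now=None):
    """Route a record to a DW 2.0 sector based on its age.
    Thresholds are illustrative, following the rough ages above."""
    now = now or datetime.utcnow()
    age = now - record_time
    if age <= timedelta(seconds=2):
        return "Interactive"
    if age <= timedelta(hours=24):
        return "Integrated"
    if age <= timedelta(days=4 * 365):
        return "Near Line"
    return "Archival"

now = datetime(2011, 11, 11)
print(sector_for(now - timedelta(seconds=1), now))   # Interactive
print(sector_for(now - timedelta(hours=5), now))     # Integrated
print(sector_for(now - timedelta(days=900), now))    # Near Line
print(sector_for(now - timedelta(days=2200), now))   # Archival
```

In a real warehouse this movement is driven by probability of access rather than age alone, as the sections below explain.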

Interactive Sector - Only a modest amount of data is found in the Interactive Sector. The interactive data almost always resides on disk storage. In addition to offering fast performance, the transactions that run through the Interactive Sector are able to do updates: data can be added, deleted, or modified.

Integrated Sector - This is where data is organized into major subject areas and where detail is kept. The summary data found in the Integrated Sector is summary data that is used in many places and that doesn't change. The data is granular: there are a lot of atomic units of data to be collected and managed. The data is historical: there is often 3 to 5 years' worth of data. The data comes from a wide variety of sources.

Near Line Sector - Performance is enhanced by downloading data with a low probability of access to the Near Line Sector. Because only data with a low probability of access is sent there, the data remaining on disk storage in the Integrated Sector is freed from the overhead of "bumping into" large amounts of data that is not going to be used.
Upon leaving the Near Line Sector, data normally moves into the Archival Sector. Note that the Archival Sector may be fed data directly from the Integrated Sector without passing through the Near Line Sector. However, if the data has been moved into the Near Line Sector, it is normally moved from there to the Archival Sector. The movement of data to the Archival Sector is made when the probability of accessing the data drops significantly.

Archival Sector - When data is sent to the Archival Sector, it may or may not be appropriate to preserve the structure that the data had in the integrated or near-line environments. There are advantages and disadvantages both to preserving the structure of the data and to not preserving it. One advantage of preserving the structure as data passes into the Archival Sector is that it is easy to do: the data is simply read in one format and written out in the same format. That is about as simple as it gets. But there are reasons this approach may not be optimal. One reason is that once the data becomes archived, it may not be used the same way it was in the integrated environment.