Wednesday, 8 February 2012

performance tuning

This is the first of the number of articles on the series of Data Warehouse Application performance tuning scheduled to come every week. This one is on Informatica performance tuning.

Source Query/ General Query Tuning

1.1 Calculate original query cost
1.2 Can the query be re-written to reduce cost?
  • Can IN clause be changed with EXISTS?
  • Can a UNION be replaced with UNION ALL if we are not using any DISTINCT cluase in query?
  • Is there a redundant table join that can be avoided?
  • Can we include additional WHERE clause to further limit data volume?
  • Is there a redundant column used in GROUP BY that can be removed?
  • Is there a redundant column selected in the query but not used anywhere in mapping?
1.3 Check if all the major joining columns are indexed
1.4 Check if all the major filter conditions (WHERE clause) are indexed
  • Can a function-based index improve performance further?
1.5 Check if any exclusive query hint reduce query cost
  • Check if parallel hint improves performance and reduce cost
1.6 Recalculate query cost
  • If query cost is reduced, use the changed query

Tuning Informatica LookUp

2.1 Redundant Lookup transformation
  • Is there a lookup which is no longer used in the mapping?
  • If there are consecutive lookups, can those be replaced inside a single lookup override?
2.2 LookUp conditions
  • Are all the lookup conditions indexed in database? (Uncached lookup only)
  • An unequal condition should always be mentioned after an equal condition
2.3 LookUp override query
  • Should follow all guidelines from 1. Source Query part above
2.4 There is no unnecessary column selected in lookup (to reduce cache size)
2.5 Cached/Uncached
  • Carefully consider whether the lookup should be cached or uncached
  • General Guidelines
  • Generally don't use cached lookup if lookup table size is > 300MB
  • Generally don't use cached lookup if lookup table row count > 20,000,00
  • Generally don't use cached lookup if driving table (source table) row count < 1000
2.6 Persistent Cache
  • If found out that a same lookup is cached and used in different mappings, Consider persistent cache
2.7 Lookup cache building
  • Consider "Additional Concurrent Pipeline" in session property to build cache concurrently
  • "Prebuild Lookup Cache" should be enabled, only if the lookup is surely called in the mapping

Tuning Informatica Joiner

3.1 Unless unavoidable, join database tables in database only (homogeneous join) and don't use joiner
3.2 If Informatica joiner is used, always use Sorter Rows and try to sort it in SQ Query itself using Order By (If Sorter Transformation is used then make sure Sorter has enough cache to perform 1-pass sort)
3.3 Smaller of two joining tables should be master

Tuning Informatica Aggregator

4.1 When possible, sort the input for aggregator from database end (Order By Clause)
4.2 If Input is not already sorted, use SORTER. If possible use SQ query to Sort the records.

Tuning Informatica Filter

5.1 Unless unavoidable, use filteration at source query in source qualifier
5.2 Use filter as much near to source as possible

Tuning Informatica Sequence Generator

6.1 Cache the sequence generator

Setting Correct Informatica Session Level Properties

7.1 Disable "High Precision" if not required (High Precision allows decimal upto 28 decimal points)
7.2 Use "Terse" mode for tracing level
7.3 Enable pipeline partitioning (Thumb Rule: Maximum No. of partitions = No. of CPU/1.2) (Also remember increasing partitions will multiply the cache memory requirement accordingly)

Tuning Informatica Expression

8.1 Use Variable to reduce the redundant calculation
8.2 Remove Default value " ERROR('transformation error')" for Output Column.
8.3 Try to reduce the Code complexity like Nested If etc.
8.4 Try to reduce the Unneccessary Type Conversion in Calculation

tracing level


                 
                  Tracing level represents the amount of information that informatcia server writes in a log file.
Types of tracing level
Normal
Verbose
Verbose init
Verbose data
When you configure a transformation, you can set the amount of detail the Integration Service writes in the session log.
The session log tracing levels are:

1. Normal: Integration Service logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. Summarizes session results, but not at the level of individual rows.

2. Terse: Integration Service logs initialization information and error messages and notification of rejected data.

3. Verbose Initialization: In addition to normal tracing, Integration Service logs additional initialization details, names of index and data files used, and detailed transformation statistics.

4. Verbose Data: In addition to verbose initialization tracing, Integration Service logs each row that passes into the mapping. Also notes where the Integration Service truncates string data to fit the precision of a column and provides detailed transformation statistics.
By default, the tracing level for every transformation is Normal.

Monday, 29 August 2011

IMP SQL used in INFORMATICA

1) display duplicate rows?

sql> select deptno from emp group by deptno having count(*)>1;

  2) delete duplicate rows?

sql> delete from emp e1 where rowid>(select min(rowid) from emp e2 where e1.deptno=e2.deptno);

  3) update emp sal based on hiredate?

sql> update emp set sal=sal+500 where hiredate in(select hiredate from emp having count(*)>1 group by hiredate);

  4) display top n max sal?

sql> select rownum,sal from (select sal from emp order by sal desc)where rownum<=5;

  5) display the 5th record of the table?

sql> select *from emp where empno=(select empno from emp where rownum<5 minus select empno from emp where rownum<4);

  6) select nth max salary?

sql> select min(sal) from (select distnict (sal) from emp order by sal desc) where rownum<='&n';

  7) display rownum with records?

sql> select rownum,emp. *from emp;

  8) Display the records between two range?

sql> select rownum,ename,empno from emp where rowid in(select rowid from emp where rownum <=&upto minus select rowid from emp where rownum<&Start);

  9) Odd number of records?

sql> select * from emp where (rowid,1) in (select rowid, mod(rownum,2) from emp);

  10) Even number of records?

sql> select * from emp where (rowid,0) in (select rowid, mod(rownum,2) from emp);
  
  11) How To Display last 5 records in a table?

sql> select * from (select rownum r, emp.* from emp) where r between (Select count(*)-5 from emp)
      and
     (Select count(*) from emp);


Thursday, 25 August 2011

SQL


SQL




SQL is a standard language for accessing databases. It stand for structured query language.
RDBMS stands for Relational Database Management System. The data in RDBMS is stored in database objects called tables. A table is a collection of related data entries and it consists of columns and rows.

SQL DML and DDL
SQL can be divided into two parts: The Data Manipulation Language (DML) and the Data Definition Language (DDL).

The query and update commands form the DML part of SQL:
  • SELECT - extracts data from a database 
SELECT column_name(s) FROM table_name 
  • UPDATE - updates data in a database: It is used to update existing records in a table.
  •  
  • UPDATE table_name SET column1=value, column2=value2  WHERE some_column=some_value
  •  
  • DELETE - deletes data from a database
  •  
  • The DELETE statement is used to delete records in a table.
    DELETE FROM table_name WHERE some_column=some_value 
  •  The INSERT INTO Statement : It is used to insert a new row in a table.
  • It is possible to write the INSERT INTO statement in two forms. The first form doesn't specify the column names where the data will be inserted, only their values:
    INSERT INTO table_name VALUES (value1, value2, value3,...)

    The second form specifies both the column names and the values to be inserted:
    INSERT INTO table_name (column1, column2, column3,...)
    VALUES (value1, value2, value3,...)
The DDL part of SQL permits database tables to be created or deleted. It also define indexes (keys), specify links between tables, and impose constraints between tables. The most important DDL statements in SQL are:
  • CREATE DATABASE - creates a new database : - CREATE DATABASE database_name
  • ALTER DATABASE - modifies a database
  • CREATE TABLE - creates a new table
  • CREATE TABLE Persons:
    (P_Id int,LastName varchar(255),FirstName varchar(255),Address varchar(255),City varchar(255)
  •  
  • ALTER TABLE - modifies a table
  • The ALTER TABLE statement is used to add, delete, or modify columns in an existing table.

    ALTER TABLE table_name
    ADD column_name datatype
        ALTER TABLE table_name
    DROP COLUMN column_name
     
  • ALTER TABLE table_name
    ALTER COLUMN column_name datatype
  •  
  • DROP TABLE - deletes a table: :- DROP TABLE table_name
  • CREATE INDEX - creates an index (search key)
  • DROP INDEX - deletes an index
Distinct:
In a table, some of the columns may contain duplicate values.  The DISTINCT keyword can be used to return only distinct (different) values.

SELECT DISTINCT column_name(s)  FROM table_name;
  
WHERE Clause :
The WHERE clause is used to extract only those records that fulfill a specified criterion.
SELECT column_name(s) FROM table_name WHERE column_name operator value ;

The AND & OR Operators The AND operator displays a record if both the first condition and the second condition is true.The OR operator displays a record if either the first condition or the second condition is true.

Ex: SELECT * FROM Persons WHERE LastName='Svendson' AND (FirstName='Tove' OR FirstName='Ola')
 The ORDER BY Keyword The ORDER BY keyword is used to sort the result-set by a specified column. The ORDER BY keyword sort the records in ascending order by default.
SELECT column_name(s) FROM table_name ORDER BY column_name(s) ASC|DESC

SQL Wildcards can substitute for one or more characters when searching for data in a database.SQL wildcards must be used with the SQL LIKE operator.With SQL, the following wildcards can be used:

WildcardDescription
%A substitute for zero or more characters
_A substitute for exactly one character
[charlist]Any single character in charlist
[^charlist]
[!charlist]


SQL JOIN:

The JOIN keyword is used in an SQL statement to query data from two or more tables, based on a relationship between certain columns in these tables.Tables in a database are often related to each other with keys.
Different SQL JOINs
Before we continue with examples, we will list the types of JOIN you can use, and the differences between them.
  • INNER JOIN: Return rows when there is at least one match in both tables 
SELECT column_name(s) FROM table_name1 INNER JOIN table_name2 ON table_name1.column_name=table_name2.column_name 
  • LEFT JOIN: Return all rows from the left table, even if there are no matches in the right tableSELECT column_name(s) FROM table_name1 LEFT JOIN table_name2 ON table_name1.column_name=table_name2.column_name
  • RIGHT JOIN: Return all rows from the right table, even if there are no matches in the left table 
SELECT column_name(s) FROM table_name1 RIGHT JOIN table_name2 ON table_name1.column_name=table_name2.column_name
  • FULL JOIN: Return rows when there is a match in one of the tables
 SELECT column_name(s) FROM table_name1 FULL JOIN table_name2 ON table_name1.column_name=table_name2.column_name

The SQL UNION Operator

The UNION operator is used to combine the result-set of two or more SELECT statements.
Notice that each SELECT statement within the UNION must have the same number of columns. The columns must also have similar data types. Also, the columns in each SELECT statement must be in the same order.
SELECT column_name(s) FROM table_name1
UNION
SELECT column_name(s) FROM table_name2
Note: The UNION operator selects only distinct values by default. To allow duplicate values, use UNION ALL.

SELECT column_name(s) FROM table_name1
UNION ALL
SELECT column_name(s) FROM table_name2
PS: The column names in the result-set of a UNION are always equal to the column names in the first SELECT statement in the UNION.

The SELECT INTO statement selects data from one table and inserts it into a different table.
The SELECT INTO statement is most often used to create backup copies of tables.

SQL SELECT INTO Syntax

We can select all columns into the new table:
SELECT column_name(s)
INTO new_table_name [IN externaldatabase]
FROM old_tablename where condition






ok

Informatica interview questions and answers part 2


Informatica interview questions and answers part 2
Informatica Interview Questions and answers
1. What are the different modes of data processing available in Informatica server?
The Informatica server can have the data movement and processing done in Unicode or ASCII (1 byte format) modes. Code page validation and relationships are affected by the mode in which it operates. The IS can be configured to work in either of these modes with configuration parameter changes and a fresh start. The Unicode takes up 2 bytes for the normal ASCII characters and 3 bytes for non-ASCII values like characters from Japanese, Chinese language.

2. Explain code page compatibility.
The code page compatibility refers to the data movement characteristic between two code pages. If the codes pages are similar to each other with same characters sets, then there is no loss of data. The target code page can be superset of the other code page which means that it has all the characters that are represented in the other code page. But if the target is just a subset (not all unicode characters in the code page can be matched to characters in the target page) then these pages are not compatible as there will be data losses in the transformation.

3. What are the different threads that are created by the DTM?
The various threads that are created by the Data Transformation Manager are:
●        The single master thread that coordinates all the other threads that are created.
●        The mapping thread that manages each session and mapping parameters.
●        The sessions have the pre and post session operations carried out by separate threads.
●        There is a reader and writer thread for reading from source and to write to the final target from the source pipeline respectively.
●        Each partition is also associated with its own transformation thread.

4. What are sessions? How can you combine executions using batches?
The session refers to the set of instructions that have to be executed to transform/move the data from the sources to the specified targets. The pmcmd command or the session manager can be used to execute the session. The executions of the sessions can be clubbed in a serial or parallel manner by using batch execution. Each batch consists of multiple sessions that may execute in Sequential style or in a Concurrent manner (parallel execution) in the IS.

5. How is mapping variable different from mapping parameter?
The mapping variable represents the value that can change during the execution of the session. The final value used for the variable is stored by the Informatica server after the completion and is used again when the session is run again. The mapping parameters on the other hand are specific values that are maintained constant throughout the execution. You can define the parameters and their usage within the mapping procedure. Before start, these parameters are initialized with the specific value you assign.

6. Why is the aggregator cache file required?
The aggregator transformations have to be completed in chunks of instructions per run and the aggregator stores the intermediate values encountered in the buffer memory locally. If the processing required additional memory then the aggregator creates additional cache files to store the values of the transformation.

7. What does look up transformation mean?
The look up transformations are the those which have to access the RDBMS based datasets. The accesses are quickened up by the Informatica server by having the look up table with indexes pointing to specific data in the tables or views of the database. The look up condition is matched for all the look up ports that are issued in the transformation and the final data is retrieved.

8. What are stand-alone sessions? How can they be recovered in case of failure?
A session which is not attached to a batch is referred to as stand-alone session. The stand-alone session can be recovered from failed run by using the pmcmd command or by using the server manager interface. The server manager has the “server requests” menu for each session. You just highlight the session you want to recover and then select server requests -> stop it and then start it again. The same can be done with the use of pmcmd command in the command line.
9. How can you obtain the reports about the repository without having to deal with any SQL or other transformations?
The metadata reporter is the one that is used for the generation of reports regarding the repository. The web app does not require any knowledge about the various SQL queries etc.
10. Is the Fact table normalized or denormalized one?
The FACT table is always maintained a denormalized one with foreign keys present in it against the various primary key attributes of the dimension tables.

11. What is the difference between filter transformations and router transformations?
The filter transformations test for a single condition and the records that do not satisfy the condition are taken off. The router transformations on the other hand can check for multiple conditions (better to use this instead of multiple filter transformations) and the rows that do not satisfy the specific condition can be categorized into separate categories.

12. What do you mean by Enterprise Data Warehousing?
The EDW is the concept of generating a single point of access of organization information. There is a single data warehouse which stores all the information and it gives a global view of the data that is present on the server. This also gives the opportunity for the analysis to be done periodically over the same source. The development time for the centralized warehouse however is reasonably higher, but with better results.

13. What do you mean by a domain?
A domain refers to the collection of all the related nodes and other relationships that come under the single administrative point. This makes the entire data base more manageable.

14. What is a mystery dimension and a junk dimension?
The mystery dimension is the one which stores the sensitive data that has to be hidden from the view of others. The junk dimension is the one which accumulates all the unwanted information from the specific record. Those parameters which do not follow under a specific category or column are organized into a manageable dimension referred to as junk dimension. A fact sheet that is related to company but not really important for the database operations can be seen as an example.

15. What is the difference between the repository server and the powerhouse server?
The power house refers the management entity which is used to control the execution of the different processes across the components in the informatica server’s database repository. The repository server on the other hand is the control center for the the entire repository, including the various tables and other procedures etc. It maintains the consistency and integrity of the repository.

16. Why do we partition a session?
The partitioning of a session pipeline refers to the various independent execution sequences in the session. This can be used to improve the efficiency and the performance of the server. The extraction, transformations and the related output for each of these can be carried out in parallel for each partition that is being created.

17. What is a rank index?
The values are generated for each of the ranks that area associated with the different ports is referred to as rank indices.
18. What is the difference between active and passive transformations?
The active transformations are the ones that do the changes to the number of rows. For example the filter transformations actually move the rows involved in the transformation if they are not meeting the condition. The passive transformations are those which just parse all the rows and create some inference from those. Example for such type of transformations are the ones like expression transformations.

19. How can you eliminate duplicate rows from a flat file?
The sorter transformation applied with the distinct operation over the rows and keys will take every port as a part of the sort key. This when done will eliminate the entries of rows with the same values for the rows.

20. How can you created indexes after the load process is completed?
The session level command tasks can be used. The scripts for creating the indexes can be associated with session’s workflow or with the post-session execution sequence. But such index generations cannot be handled at the transformation level after the load process.

Informatica Interview Questions and answers


Informatica Interview Questions and answers
Informatica Interview Questions and answers
1) What are the type of repositories created using Informatica Repository Manager?
Ans) Informatica PowerCenter includes following type of repositories:
Standalone Repository, which functions individually.
Global Repository which is a centralized repository in a domain and it also contain shared objects across the repositories in a domain.
Local Repository is one which is within a domain.
Versioned Repository can be either local or global but it allows version control.

2) What is a code page?
Ans) A code page contains encoding to specify characters in a set of one or more languages and is selected based on source of the data. The set code page refers to a specific set of data that describes the characters the application recognizes. This influences the way that application stores, receives, and sends character data.

3) What do you mean by code page compatibility?
Ans.>When two code pages are compatible, the characters encoded in the two code pages are virtually identical which ensures no data loss. This compatibility is used for accurate data movement when the Informatica Sever runs in the Unicode data movement mode. One code page can be a subset or superset of another. For proper data movement, the target code page must be a superset of the source code page.

4) What is a transformation?
Ans.>A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. Each transformation has rules for configuring and connecting in a mapping. Transformation is created to use once in a mapping or reusable transformations can be created to use in multiple mappings.
Eg. Aggregator transformation performs calculations on groups of data.

5) What are the types of loading in Informatica?
Ans.>There are two types of loading in informatica, normal loading and bulk loading.
In normal loading record by record are loaded and writes log for that. In this longer time is needed to load data to the target.
In bulk loading number of records are loaded at a time to target database. It takes less time to load data to the target than in normal loading.

6)Why do we use the lookup transformation? How can we improve session performance in aggregator transformation? 
Ans.>A lookup transformation is used for checking the matched values from the source or target tables and check whether the record already existing in the table. It is also used for updating the slowly changing dimensions and also performs some calculations.
Using Incremental Aggregation we create Sorted Input option to improve the performance since performance is reduced using the caches.

7) What is the difference between static cache and dynamic cache?
Ans.>In case of dynamic cache, when we are inserting a new row it checks the lookup cache to see if it exists, if not inserts it into the target as well as the cache but in case of static cache the new row is written only in the target and not the lookup cache.
The lookup cache remains static and does not change during the session but incase of dynamic cache the server inserts, updates in the cache during session.

8) What is throughput in Informatica? Where can we find this option to check? How does it works?
Ans.>Throughput is the rate at which power centre server read the rows in bytes from source or write the rows in bytes into the target per second.
We can find this option in workflow monitor.
Its working is as follows: right click on session choose properties and transformation statistics tab, there we can find throughput details for each instance of source and target.

9) What is a source qualifier? What do you mean by Query Override?
Ans.>Source Qualifier represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session. When the definition of the relational or flat file is added to mapping then it is connected to Source Qualifier transformation. The default query is SELECT statement containing all the source columns. Source Qualifier has capability to override this default query by changing the default settings of the transformation properties.

10)What is aggregate cache in aggregator transformation?
Ans.>The aggregator stores data in the aggregate cache until it completes aggregate calculations. When we run a session that uses an aggregator transformation, the informatica server creates index and data caches in memory to process the transformation. If the informatica server requires more space, it stores overflow values in cache files.

11) What are two types of processes that runs the session?
Ans.>The two types of processes that runs the session are Load Manager and DTM process.
Load manager process starts the session, creates DTM process, and sends post session email when the session completes.
DTM process creates threads to initialize the session, read, write and transform data and handle pre-session and post-session operations.

12)What are the difference between joiner transformation and source qualifier transformation?
Ans.>In joiner transformation heterogeneous data sources can be joined but this cannot be achieved incase of source qualifier transformation.
We need matching keys to join two relational sources in source qualifier transformation whereas this is not needed incase of joiner. Two relational sources should come from same data source in source qualifier. We can join relational sources which are coming from different sources also.

13)What is Datadriven?
Ans.>The informatica server follows instructions coded into update strategy transformations within the session mapping which determine how to flag records for insert, update, delete or reject. If we do not choose data driven option setting, the informatica server ignores all update strategy transformations in the mapping.

14)What are the types of mapping wizards that are provided in Informatica?
Ans.>The designer provide two mapping wizard.
Getting Started Wizard creates mapping to load static facts and dimension tables as well as slowly growing dimension tables.
Slowly Changing Dimensions Wizard, creates mappings to load slowly changing dimension tables based on the amount of historical dimension data we want to keep and the method we choose to handle historical dimension data.

informatica based sql


Oracle SQL Interview Questions and Answers.


1. How do I eliminate the duplicate rows ?

Ans: delete from table_name where rowid not in (select max(rowid) from table group by
duplicate_values_field_name);
or

delete duplicate_values_field_name dv from table_name ta where rowid <(select
min(rowid) from table_name tb where ta.dv=tb.dv);

2.How do I display row number with records?

Ans:Select rownum,emp.* from emp

3.Display the records between two range?

Ans: 
select rownum, empno, ename from emp where rowid in (select rowid from emp where rownum
<=&upto minus select rowid from emp where rownum<&Start);
Enter value for upto: 10
Enter value for Start: 7

4.I know the nvl function only allows the same data type(ie. number or char or date
Nvl(comm, 0)), if commission is null then the text “Not Applicable” want to display,
instead of blank space. How do I write the query?

Ans:select nvl(to_char(comm.),'NA') from emp;

5. Find out nth highest salary from emp table?

Ans:SELECT DISTINCT (a.sal) FROM EMP A WHERE &N = (SELECT COUNT (DISTINCT (b.sal)) FROM
EMP B WHERE a.sal<=b.sal);

or

SELECT * FROM (SELECT DISTINCT(SAL),DENSE_RANK() OVER (ORDER BY SAL DESC) AS RNK FROM EMP) WHERE RNK=&N

or

select min(sal) from (select distinct sal from emp order by sal desc) where rownum <=&n

6. Find out nth highest salary DEPT wise from emp table?

Ans:SELECT * FROM (SELECT DISTINCT(SAL),DENSE_RANK() OVER (PARTITION BY DEPTNO ORDER BY SAL DESC) AS RNK FROM EMP) WHERE RNK=&N

7. Display Odd/ Even number of records?

Ans:Odd number of records:
select * from emp where (rowid,1) in (select rowid, mod(rownum,2) from emp);

Even number of records:
select * from emp where (rowid,0) in (select rowid, mod(rownum,2) from emp);

8.What are the more common pseudo-columns?

Ans: SYSDATE, USER , UID, CURVAL, NEXTVAL, ROWID, ROWNUM

9.How To Display last 5 records in a table?

Ans: select * from (select rownum r, emp.* from emp) where r between (Select count(*)-5 from emp)
and
(Select count(*) from emp)

10.How To Display last record in a table?

Ans: select * from (select rownum r, emp.* from emp) where r in (Select count(*) from emp)

11. How To Display particular nth record in a table?

Ans: select * from (select rownum r, emp.* from emp) where r in (2) or r=2

12.How To Display even or odd records in a table?

Ans:select * from (select emp.* , rownum r from emp) where mod (r,2)=0

13. What is the difference between a HAVING CLAUSE and a WHERE CLAUSE?

Ans:Specifies a search condition for a group or an aggregate. HAVING can be used only with the SELECT
statement. HAVING is typically used in a GROUP BY clause. When GROUP BY is not used, HAVING
behaves like a WHERE clause. Having Clause is basically used only with the GROUP BY function in a
query. WHERE Clause is applied to each row before they are part of the GROUP BY function in a query.

14.What is sub-query? Explain properties of sub-query?

Ans:Sub-queries are often referred to as sub-selects, as they allow a SELECT statement to be executed
arbitrarily within the body of another SQL statement. A sub-query is executed by enclosing it in a set of
parentheses. Sub-queries are generally used to return a single row as an atomic value, though they
may be used to compare values against multiple rows with the IN keyword.
A subquery is a SELECT statement that is nested within another T-SQL statement. A subquery SELECT
statement if executed independently of the T-SQL statement, in which it is nested, will return a result
set. Meaning a subquery SELECT statement can standalone and is not depended on the statement in
which it is nested. A subquery SELECT statement can return any number of values, and can be found
in, the column list of a SELECT statement, a FROM, GROUP BY, HAVING, and/or ORDER BY clauses of a
T-SQL statement. A Subquery can also be used as a parameter to a function call. Basically a subquery
can be used anywhere an expression can be used.

15.Properties of Sub-Query

Ans: A subquery must be enclosed in the parenthesis.
A subquery must be put in the right hand of the comparison operator, and
A subquery cannot contain a ORDER-BY clause.
A query can contain more than one sub-queries.

16. What are types of sub-queries?

Ans:Single-row sub query, where the subquery returns only one row.
Multiple-row sub query, where the subquery returns multiple rows,.and
Multiple column subquery, where the sub query returns multiple columns.

17. what is the out put for query select * from emp where rownum<=3

Ans: it display first 3 Records

18.what is the out put for query select * from emp where rownum=1;

Ans: it display first Record in the table

19. what is the out put for query select * from emp where rownum=2;
Ans: it will not display any record

20.what is the out put for query select * from emp where rownum>1;

Ans: even this also will not display the records. why because when it fetch the first record rownum is 1 so condition fail so it will not get first record when it fetches 2nd record rownum is again 1 because it didn't pick up first record so 2nd time also condition failed.

21. How to display Top N salaries in emp?

Ans: select * from (select distinct sal from emp order by sal desc) where rownum<=&n

22. How To display Last Record in emp table?

Ans: Select * from ( select rownum as rn,emp.* from emp) where rn in(select count(*) from emp)

23. How To display First and last Records in emp table?

Ans:select * from ( select rownum as rn,emp.* from emp) where rn in(1,(select count(*) from emp))

24. How to Diplay 1,5,8 records in emp table?

Ans: select * from ( select rownum as rn,emp.* from emp) where rn in (1,5,8)

25. In Oracle, can we add a Not Null column to a table with data? If "No" then how
can we do that?

Ans:No, we cannot add a Not Null column to a table with data. Oracle throws Error ORA-01758.
See example below!

Eg: alter table EMP add comm2 number not null

Error: ORA-01758: table must be empty to add mandatory (NOT NULL) column.

Workaround:

Provide a Default value to the column being added, along with the NOT NULL constraint. Then the
column will get added with the default value for all existing rows.

Eg: alter table EMP add comm2 number not null default 100 -- Comm2 will have 100 for all rows

26. While doing an ascending order sort on a column having NULL values, where does
the NULLs show up in the result set? At the beginning or at the end?

Ascending order sort - NULLs come last because Oracle treats NULLs are the largest possible values

Descending order sort - NULLs come first

* How to make NULLs come last in descending order sort?

Add NULLS LAST to the order by desc clause

Eg: select col1 from table1 order by col1 desc NULLS LAST

27. how to set Time of execution of an SQL Statement

first run this in sql prompt: set timing on
After execution of each query we get the time take for it
if you don't want run this : set timing off

28.What is the Datatype of NULL in Oracle?

Ans:Datatype of NULL is "char(0)" and size is '0'

29.Oracle Functions - Replace versus Trim

SQL> select replace('jose. antony@ yahoo.com',' ', null) as Replace1 from dual;

REPLACE1
--------------------
jose.antony@yahoo.com  --Removes all spaces from in-between

SQL> select trim('jose. antony@ yahoo.com') as Trim1 from dual;


TRIM1
----------------------
jose. antony@ yahoo.com --Removes spaces from both sides only

30. Explain ROWID in Oracle?

ROWID is a unique hexadecimal value which Oracle inserts to identify each record being inserted. It is
used for all Full Table scans.

Structure:

OOOOOOFFFBBBBBBRRR

OOOOOO - First six characters is the Object Number which idenities the Data Segment
FFF - Next 3 characters is the Database File number
BBBBBB - Next 6 characters shows the DataBlock number
RRR -Next 3 characters identified the Row within the block

31. What is difference between Co-related sub query and nested sub query?

Correlated subquery runs once for each row selected by the outer query. It contains a reference to a value from the row selected by the outer query.

Nested subquery runs only once for the entire nesting (outer) query. It does not contain any reference to the outer query row.

For example,

Correlated Subquery:

select e1.empname, e1.basicsal, e1.deptno from emp e1 where e1.basicsal = (select max(basicsal) from emp e2 where e2.deptno = e1.deptno)

Nested Subquery:

select empname, basicsal, deptno from emp where (deptno, basicsal) in (select deptno, max(basicsal) from emp group by deptno)

32.What is the difference between TRUNCATE and DELETE commands?

Ans:Both will result in deleting all the rows in the table .TRUNCATE call cannot be rolled back as it is a DDL command and all memory space for that table is released back to the server. TRUNCATE is much faster.Whereas DELETE call is an DML command and can be rolled back.
33. How to find out the duplicate column
Ans: select column_name,count(*) from table_name having count(*)>1
if the result more than 1 then we can say that this column having duplicate records

34. How to find 2nd max salary from emp ?
Ans: select max(sal) from emp where sal not in(select max(sal) from emp)

35. How to find max salary department wise in emp table?
Ans:select deptno,max(sal) from emp group by deptno;
36. How to find 2nd max salary department wise in emp table?
select deptno,max(sal) from emp where (deptno,sal) not in(select deptno,max(sal) from emp group by deptno) group by deptno;

37. Table1 having 10 records and table2 having 10 records both tables having 5 matching records. then how many records will display in 1. equi join 2.left outer join 3. right outer join 4. full outer join
Ans: 1.in equi join matching records will display it means 5records will display
2.in left outer join matching 5 and non matching 5 records in left table so total 10 will display
3.
in right outer join matching 5 and non matching 5 records in right table so total 10 will display.

4. 
in full outer join matching 5 and non matching 5 records in left table and non matching records in right table so total 15 will display

38.
EMP table, for those emp whose Hiredate is same, update their sal by "sal+500" or else for others keep the sal as it is, how to do it by SQL query?

Ans:UPDATE emp SET sal=sal+500 WHERE hiredate IN (SELECT hiredate FROM employees HAVING COUNT(*)>1 GROUP BY hiredate)