SQL Server Transaction and Locking for beginners

What is Transaction?

Transaction groups set of tasks as one logical unit. A Transaction ensures that either whole group of tasks commits or whole group of tasks rollback.

If one of the commands in transaction fails, all of the commands fail and any data that was modified in database is rolled back.

Think about a bank transaction where money is sent from one account to other. So it has two tasks –

  1. Amount is deducted from the senders account.
  2. Amount is added to the receivers account.

Now both of these tasks should either complete successfully or they should get full rolled back anything partial will lead to issues.

Example of Transaction using commit and rollback command:

First I have created a database as “CustomerDB” and in that I have created a table as “Table_Customer”.

Now we will start the process of Transaction

To group set of task we need to use the syntax as

  1. “begin tran”
  2. Process database commands
  3. “commit tran”

Both of these inserts are now one logical unit.

Result after the query execution

Now to see the data present in table, execute the below query and see the results

I have kept ‘ID’ column as primary key, so let’s see what happens if I enter some duplicate records

It shows me primary key violation error when I tried to insert ‘ID’=1 again with different name.

As we have grouped the tasks in one logical unit (both the records ID=1 as well as ID=3) so the record of ‘ID’=3 should also be terminated when I execute the ‘Select’ query. Let’s execute ‘Select’ query and see the results.

We can see the record of ‘ID’=3 been displayed in the result set instead of using Transaction.

To overcome this problem we have to use a global variable as “@@error” with some condition as shown below

First it showed that 1 row effected (the record of ‘ID’=3 tries to get insert into table)

But because of duplicate record (‘ID’=1) the primary key violation error appeared and because of “rollback tran” command the record of ‘ID’=3 is also terminated.

Now let’s execute ‘select’ query to see the results and this time the record of ‘ID’=3 should also get terminated with ‘ID’=1

Now what if I insert ‘ID’=4 (no duplicate record)?

Now if I execute ‘select’ query both the records should get displayed.

So “@@error” which is a global variable informs us if there is error in this execution or not.

Nested Transaction, TranCount and Checkpoints:

Nested Transaction:

Nested Transaction means transaction inside a transaction.
The Inner Transaction is not physically committed while the Outer Transaction is physically committed.

TranCount:

TranCount shows the number of active transaction that have been made.

First we had 2 transaction so it showed number as 2,
Then 1 transaction got committed so it showed number as 1 and finally when last transaction got committed everything got committed and it showed number as 0.

Checkpoints:

Checkpoints are logical flags or logical completion points where someone wants to go to a specific transaction and revert back.

For example: When the transaction is happening we want to create a flag saying this is logical point 1, logical point 2, etc and then something happens wrong with the transaction, we can to rollback to a specific logical point.

Let’s see how we can do this practically as shown below:

begin tran t1
	insert into Table_Customer values (5,'Ajay')
	insert into Table_Customer values (6,'Nagesh')
	save tran l1					-- checkpoint 1
		begin tran t2
			insert into Table_Customer values(7,'Vikas')
			insert into Table_Customer values(8,'Shaam')
			save tran l2			-- checkpoint 2
			commit tran t2
		rollback tran l1
commit tran t1

As we can see above that we have given checkpoints to both transaction as l1 and l2 respectively.

The values that is been given to task(transaction) t2 will be terminated as I have given command as ‘rollback tran l1’. It will only store the values of task t1 when I execute the ‘select’ statement.

select * from Table_Customer

We can see from results it has only inserted the values of task t1 and because we have used ‘rollback tran l1’ it has terminated the values of task t2.

Concurrency:

When we are doing Transaction one more thing we need to handle is concurrency.

Concurrency is nothing but a situation where in two users are trying to access the same information and when they are accessing the information we do not want any kind of inconsistent results or abnormal behavior.

Let’s understand with an example shown below:

--User 1
begin tran 
	update Table_Customer set name = 'Arjun'
	where ID=5
	waitfor delay '00:00:10'
	update Table_Customer set name = 'Irappa'
	where ID=5
rollback tran

Here I have started a transaction where I am trying to update a data whose column ‘ID’ = 5

Now in this transaction I am trying to update ‘ID’=5 as ‘Arjun’ and at the same time I am trying to update ‘ID’=5 as ‘Irappa’.

Here the ‘waitfor delay’ command is used to make this transaction wait for some time. We have used 10 seconds as delay time.

-- User2
select * from Table_Customer

But if you see here I already have a data of ‘ID’=5 which has a value as Ajay.

Now in the above condition the user is updating ‘ID’=5 as ‘Arjun’ again the same transaction is updating ‘ID’=5 as ‘Irappa’ and after 10 seconds the ‘ID’=5 is updating ‘Arjun’ to ‘Irappa’ and then the whole transaction will rolled back.

Within 10 seconds if other user access the ‘ID’=5 the data he will get as ‘Arjun’, he does a refresh then he gets ‘ID’=5 as ‘Irappa’ and again he refreshes it then he gets the old value as ‘Ajay’. It is really confusing to end user as it is providing the inconsistent data or uncommitted data.

So in this case I don’t want the end user to see the inconsistent data or uncommitted data, I want him to see the final committed data.

Until the commit transaction does not happen I don’t want the user to access the data of ‘ID’= 5.

To solve this problem we can do a lock to column ‘ID’=5 exclusively, so that the user cannot access or modify the data of ‘ID’ = 5.

When above User 1 query is executed, it goes on a wait mode as we have given delay time as 10 seconds as shown below.

At the same time if we try to execute the ‘select’ query of user 2 it also goes on a wait mode as shown below.

-- User2
select * from Table_Customer
where ID=5

By default the exclusive lock is applied to ‘ID’ = 5 as the User 2 cannot access or modify the data until 10 seconds.

After 10 seconds it will display the old data as ‘Ajay’ as User 1 has used the rollback command.

-- User2
select * from Table_Customer
where ID=5

An important point to be notice is that this is a row level lock.

SQL Server Locking

Locking in SQL Server means to handle or to manage transaction concurrency.

It is designed to allow SQL Server to work with multi-user environment and it prevents records for being overwritten incorrectly.

\

The User2 is in read committed mode so he can only see the committed data. He is not willing to see the volatility of data that is happening between transactions but only wishes to see the data which is finally committed.

The User2 doesn’t want to be in blocking mode that is User2 wants to be in read uncommitted mode so that the User2 can sees the volatility of data happening between transactions but doesn’t want to go into blocking mode. So I have used ‘NoLock’ command to see the data between the transactions.

As the User1 executes the transaction and as he has set delay time as 30 seconds then after the user2 executes the select query he will first get the result in ‘ID’= 5 as ‘Arjun’ then after a 30 seconds refresh it will show ‘ID’=5 as ‘Irappa’ and after 30 seconds refresh it will show the old value of ‘ID’=5 as ‘Ajay’ because the data is rolled back as we have used ‘rollback tran’ command.

Let’s see how we do the above scenario practically.

First let’s check the data

-- User2
select * from Table_Customer
with (nolock)
where ID=5

User 1 has started the transaction

--User 1
begin tran 
	update Table_Customer set name = 'Arjun'
	where ID=5
	waitfor delay '00:00:30'
	update Table_Customer set name = 'Irappa'
	where ID=5
	waitfor delay '00:00:30'
rollback tran

Now User 2 immediately executes ‘select’ query, he gets the value of ‘ID’=5 as ‘Arjun’.

-- User2
select * from Table_Customer
with (nolock)
where ID=5

After 30 seconds User 2 does a refresh and he gets value of ‘ID’=5 as ‘Irappa’.

Again after 30 seconds refresh the User 2 gets back the old value of ‘ID’=5 as ‘Ajay’ because of “rollback tran” command by User 1

Now the User 2 is not in a blocking mode instead he is in a read uncommitted mode.

If we want to see which lock is applied on SQL server execute below queries

exec sp_lock
exec sp_who

Look at the first window where we have Mode as “X” which is exclusive lock.

Type as “KEY” which is nothing it is a row level.

spid as 58 which is a lock number.

If we want more details about the lock then use command as “sp_lock 58” (lock number as 58 and yes you can for any other number present in first window of the results)

Now at the second window we can see “cmd” column as WAITFOR, it’s because the user is waiting for execution time (delay time as 30 seconds) and in “status” column we have suspended (means it getting executed).

After the execution is done by User 1, the result of “sp_who 58” command will show “status” as sleeping as the transaction is done.

Note:-
X -> Exclusive lock.
Key -> Row level lock.
S -> Shared lock.
IX -> Intent Exclusive lock.
IS -> Intent Shared lock.

Deadlock:

Deadlock is a situation where in two process is block the progress of each other.

Process 1 has exclusive access to some resource and process 2 has exclusive access on some other resource. Now process 1 wants to get the access of resource for which process 2 has exclusive access and vice versa. So in this what happens is that both process block each other and goes into deadlock, then SQL Server has to take a call to kill one process and proceed with other process.

Let’s understand it practically

I have created two transaction ‘transaction 1’ and ‘transaction 2’ shown below.

--Transaction 1
begin tran 
	update Table_Customer set name = 'NewData'
	where ID=7
	waitfor delay '00:00:20'
	update Table_Customer set name = 'NewData'
	where ID=6
	commit tran

In the transaction 1 I have updated the data with ‘ID’= 7 and tries to get exclusive access to it and after 20 seconds I have updated another data with ‘ID’ = 6 and tries to get exclusive access to it.

--Transaction 2
begin tran 
	update Table_Customer set name = 'NewData'
	where ID=3
	waitfor delay '00:00:20'
	update Table_Customer set name = 'NewData'
	where ID=4
	
commit tran

In the transaction 2 I have updated the data with ‘ID’= 6 and tries to get exclusive access to it and after 20 seconds I have updated another data with ‘ID’ = 7 and tries to get exclusive access to it.

Now we will execute both the transaction and let’s see the results

Transaction 1 executing

Transaction 2 executing

Results of both after delay time of 20 seconds

Transaction 1

Transaction 2

In above scenario before the execution happened of both transactions update queries deadlock didn’t occurred but after the 20 seconds of execution of the queries that time deadlock occurred because the both transactions had exclusive access. When deadlock is occurred the SQL Server makes a call and kill one of the transaction that is transaction 2 and executes the transaction 1.

How to avoid Deadlock:

  1. Keep the transaction execution time as small as possible.
  2. When we have update data bunch of updates use chronological order.
  3. Update Lock.

Transaction 1

--Transaction 1
begin tran 
	update Table_Customer set name = 'SomeData'
	where ID=7
	waitfor delay '00:00:20'
	update Table_Customer set name = 'SomeData'
	where ID=6
	
commit tran

Transaction 2

--Transaction 2
begin tran 
	update Table_Customer set name = 'SomeData'
	where ID=7
	waitfor delay '00:00:20'
	update Table_Customer set name = 'SomeData'
	where ID=6
	
commit tran

In both the transaction I have kept ‘ID’ column value in chronological order.

After executing Transaction 1

After executing Transaction 2

The transaction 1 executes first but the transaction 2 waiting for execution. In this situation transaction 2 doesn’t go into the deadlock situation but it goes into blocking mode because the transaction 1 executes with exclusive access on ‘ID’=7 row. After the transaction 1 finish the work of updating the data then transaction 2 gets executed.

Transaction 2

In this situation one transaction waits for some time because another transaction is still doing his work.

Update Lock:

Before you are trying to do any update on data you would probably like to lock the record. In our case we would like to do update lock on ‘ID’ column so that no other user can update the data or record at a particular period of time.

Let’s understand this practically

I am trying to lock ID=7 and ID=6 so that nobody else can go and update this and I have executed both the transaction as shown below.

--Transaction 1
begin tran 
select * from Table_Customer with (updlock) where ID=7 or ID=6
	update Table_Customer set name = 'somedata1'
	where ID=7
	waitfor delay '00:00:20'
	update Table_Customer set name = 'somedata2'
	where ID=6
	
commit tran
--Transaction 2
begin tran 
	update Table_Customer set name = 'SomeData4'
	where ID=6
	waitfor delay '00:00:20'
	update Table_Customer set name = 'SomeData5'
	where ID=7
	
commit tran

Executing the query of Transaction 1

But after 20 seconds it will show the old value of ID=6 and ID=7 as ‘SomeData’ and ‘Somedata’ respectively but the values will get affected in Table_Customer.

Transaction 2 waits for the execution of Transaction 1.

After Transaction 1 gets executed, Transaction 2 will start it’s execution and here the Transaction 1 value as somedata2 and somedata1 of ID=6 and ID=7 respectively will be replaced and the update will get affected on Table_Customer with values ‘SomeData4’ and ‘SomeData5’ respectively.

In Update lock u can go and execute the ‘select’ statement but when statement enters in exclusive lock it goes in a read committed mode.

Let’s see this with an example

--Transaction 1
begin tran 
select * from Table_Customer with (updlock) where ID=7 or ID=6
	waitfor delay '00:00:20'
	update Table_Customer set name = 'somedata1'
	where ID=7
	waitfor delay '00:00:20'
	update Table_Customer set name = 'somedata2'
	where ID=6
	
commit tran

In Transaction 1 I have made some changes such as I have given delay time as 20 seconds after select statement and I execute this.

At the same time if I execute select statement it will get execute and will show the old values before the update statement gets executed as shown below.

select * from Table_Customer
where ID=6 or ID=7

If we do a refresh the update statement of ID=7 will get executed and the new value as ‘somedata1’ will be displayed as shown below.

As we can see ID=6 has the same old value because we have given delay time as 20 seconds after the update statement of ID=7.

Now if we refresh after 20 seconds the update statement of ID=6 will get executed and we will get the new value as ‘somedata2’ as shown below.

So Update Lock is very useful in minimizing the Dead Lock situation.

Shared Lock:

Shared lock has a very deep connection with update lock.

We can also say that existence of update lock is because of shared lock.

Shared locks are applied during the execution phase of ‘select’ query.

Shared locks are same as update lock but the only difference is log time duration.

Let’s practically understand the above statement

begin tran
--Shared Lock start here
select * from Table_Customer where id =6 or id=7
--Shared lock ended here
waitfor delay '00:00:10'
commit tran 

The duration of Shared lock is applied during the select execution phase even though if I start or commit the transaction while for Update lock the scope is right from the beginning of transaction to commit the transaction.

Whenever we use Shared Lock we cannot do any operations on data like delete, update, etc.

One important point about Shared Lock is that Update Lock can only be applied when we have Shared Lock and Exclusive lock can only be applied when we have Update Lock.

When we applied the Update Lock on select query at that time SQL server internally did is it applied a shared lock, after that it converted the Shared Lock into the Update Lock.

 select * from Table_Customer with (updlock) where ID=7 or ID=6

Then when we executed the update query, it converted the Update Lock to exclusive lock.

update Table_Customer set name = 'somedata1'
	where ID=7
	waitfor delay '00:00:20'

One more point that need to be noticed is that whenever we write update syntax the following things happens internally:

  • SQL Server has to do a select (Shared Lock happens).
  • The calculation is done (Shared Lock is converted to Update Lock).
  • Finally it will do the actual update (Exclusive lock happens).

When we fire the update query this time all the three locks do get executed one by one so that we do not have Concurrency problem and also get rid Dead Lock issue.

Lock Hierarchy:

All the three locks are not only applied on the row level but also applied on page level, table level, database level.

In locking hierarchy, at the bottom there is row level lock applied then page level lock is applied then table level lock is applied and finally database level lock is applied.

Intent lock:

Intent lock is used so that a transaction can inform other transaction what is the locking intention in the below hierarchy.

In intent lock, let’s say transaction 1 applies exclusive lock on a table and after that the other transaction need that table to do modifications on structure of table, at that time transaction 1 informs to other transaction that I have exclusive lock on that table so do not apply any changes on that table. After when transaction 1 will finish the task and release the lock then other transaction will have access to that table.

Isolation level is also a big part of Locking

Isolation level:

Isolation level defines how much isolated is one transaction from the other transaction.

In other words let’s say you have a transaction and that transaction is getting affected by other transaction which are doing updates, insert and delete. So probably you want to control don’t want to see the uncommitted updates nor the rows which are not committed and so on.

Basically you want to control that what kind of changes you want to see from the other transaction, that’s where the Isolation level is used.

Let’s understand it with an example

--Transaction 1
select * from Table_Customer where id =1000
waitfor delay '00:00:20'
select * from Table_Customer where id =1000

As u can see here I have two select statements where both are having ‘ID’=1000 and a delay time of 20 seconds in between and I executed this.

Now in this 20 seconds if other user updates the record of ‘ID’=1000 as ‘newvalue’ shown below, it will get executed instantly.

--Transaction 2
update Table_Customer set Name = 'newvalue' 
where id = 1000 

Now after 20 seconds of both select query execution, two values are visible as shown below

For 1st select query execution it’s showing as ‘oldvalue’ and for the 2nd select query execution it’s showing as ‘newvalue’.

The Transaction 1 is getting affected because of Transaction 2. So by using Isolation level you can define how much effect should be seen in Transaction 1.

Before proceeding to Isolation level let’s see the three issues happen when multiple transactions are executed.

    1. Repeatable Read : every time a transaction fetches a data, he sees a different data.

Example:

--Transaction 1
select * from Table_Customer where id =1000
waitfor delay '00:00:20'
select * from Table_Customer where id =1000

As u can see here I have two select statements where both are having ‘ID’=1000 and a delay time of 20 seconds in between and I executed this.

Now in this 20 seconds if other user updates the record of ‘ID’=1000 as ‘newvalue’ shown below, it will get executed instantly.

--Transaction 2
update Table_Customer set Name = 'newvalue' 
where id = 1000 

Now after 20 seconds of both select query execution, two values are visible as shown below

For 1st select query execution it’s showing as ‘oldvalue’ and for the 2nd select query execution it’s showing as ‘newvalue’.

The Transaction 1 is getting affected because of Transaction 2. So by using Isolation level you can define how much effect should be seen in Transaction 1.

So every time when Transaction 1 does a select, he always gets different data.

    1. Dirty read : Your transaction is selecting the data but the other transaction never commits the data. So your transaction is selecting uncommitted data.

Example:

--Transaction 1
select * from Table_Customer  with (nolock) where id =1000
waitfor delay '00:00:20'
select * from Table_Customer with (nolock) where id =1000

As u can see here I have two select statements where both are having ‘ID’=1000 and a delay time of 20 seconds in between and ‘nolock’ command is used to see the volatility happening between transaction (read uncommitted mode).

In Transaction 2, I have started a transaction and in between I have written an update query, then used ‘rollback tran’ command as shown below so it will get rolled back to old data after 20 seconds of select query execution.

--Transaction 2
begin tran
update Table_Customer set Name = 'testnew' 
where id = 1000
rollback tran 

Now when I execute the Transaction 1 query, In first select he will show the value as ‘testnew’ because of read uncommitted mode and in 2nd select he will show the value as ‘oldvalue’ because of ‘rollback tran’ command.

    1. Phantom read : when we do a select for first time the record is not visible but when we do a select for second time the record is visible to us.

Example:

--Transaction 1
select * from Table_Customer where id =7000
waitfor delay '00:00:20'
select * from Table_Customer where id =7000

As u can see here I have two select statements where both are having ‘ID’=7000 and a delay time of 20 seconds in between and I executed this.

As ‘ID’=7000 doesn’t exist on table so it should show me the result as blank or null.

But in this 20 seconds if other user inserts a record of ‘ID’=7000 as ‘testnewvalue’ shown below, it will get executed instantly.

--Transaction 2
insert into Table_Customer values(7000,'testnewvalue')

Now if we see the results of Transaction 1
At the 1st select the record is blank but when the 2nd select got executed the record came like a ghost.

Below fig shows that which issue can we fix with the help of following Isolation level :

In this table we can see that according to read committed mode we can fix the problem of dirty read issue and not the other two issues.

Let’s see with an example:

--Transaction 1
set transaction isolation level read committed
begin tran
select * from Table_Customer   where id =1000
waitfor delay '00:00:20'
select * from Table_Customer  where id =1000
commit tran

As u can see here I have two select statements where both are having ‘ID’=1000 and a delay time of 20 seconds in between and I executed this but this time at the top I have specified the isolation level as read committed mode to check whether it solves the Dirty read problem.

In Transaction 2, I have started a transaction and in between I have written an update query, then used ‘rollback tran’ command as shown below so it will get rolled back to old data after 20 seconds of select query execution.

   --Transaction 2
begin tran
update Table_Customer set Name = 'testnew' 
where id = 1000
rollback tran 

Now what I want is that after both select query execution in Transaction 1 it should not show me the data which is not committed (i.e. rolled back data) instead it should show me the old value as ‘somedata’ in both select query execution as shown below

So ‘read committed mode’ solves the problem of dirty read issue.

If we see the table carefully, we can see that when Isolation level is at Serializable mode it solves all the three issues that is dirty read, repeatable read and phantom read.

Let’s see with an example (resolving phantom read issue)

--Transaction 1
set transaction isolation level serializable
begin tran 
select * from Table_Customer where id = 7007
waitfor delay '00:00:20'
select * from Table_Customer where id = 7007
commit tran

In Transaction 1 I am trying to get the data of ‘ID’=7007 which is blank or not present in table and will do an execute.

 --Transaction 2
insert into Table_Customer values (7007,'ghostdata')

Now in Transaction 2, the user tries to insert a record in ‘ID’=7007 as shown above and execute it.

According to Phantom Read problem if the 1st select gets executed it shows nothing in the results but when the 2nd select gets executed the data in the result set appears like a Phantom.

But now after we have set transaction to Isolation level as Serializable, it should not show me any data in the result set as shown below

Serializable Isolation level helps you to avoid dirty read, repeatable read and phantom read.

It will minimize concurrency because it goes into exclusive lock.

Lock compatibility matrix diagram is shown below : –

This table helps you to show that if you have applied a lock then the other lock will work or not.

For example:

If we use a share lock will update lock work on it?

So according to table and with the above examples covered the answer is “yes”.

Now if you see this table there is something as Shared with Intent Lock (SIX).

It is nothing but conversion lock and including SIX we have SIU, UIX.

Conversion locks SIX, SIU and UIX:-

Conversion locks are resultant of converting one lock type to another.

Shared with Intent Exclusive (SIX):

A transaction which holds a Shared lock also has some pages or rows locked with an Exclusive lock.

Shared with Intent Update (SIU):

A transaction which holds a Shared lock also has some pages or rows locked with an Update lock.

Update with Intent Exclusive (UIX):

A transaction which holds an Update lock also has some pages or rows locked with an Exclusive lock.

ACID properties:

It is an acro name where A stands for Atomicity, C stands for Consistency, I stands for Isolation and D stands for Durability.

If you are implementing proper transaction and locks that means you are following all four ACID properties.

Atomicity: it is required when a transaction involves two or more discrete parts of information must commit all or none.

For example we have done begin tran, commit tran and rollback tran in above examples.

Consistency: it is required so that a transaction must create a valid state of new data, or it must roll back all data to the state that existed before the transaction was executed.

For example we have done begin tran, save tran(checkpoints), commit tran and rollback tran in above examples.

Isolation: it is required so that a transaction that is still running and did not commit all data yet, must stay isolated from all other transaction.

For this we have done transaction isolation level.

Durability: it is required so that committed data must be stored using method that will preserve all data in correct state and available to a user, even in case of failure.

For example mirroring, backup, replication, etc.

 

Posted in SQL Server, SQL server training, sql sever interview questions and answers, SQL training, Uncategorized | Tagged , , , , , | Leave a comment

Data Mining (Fundamentals & Time Series Algorithms)in SSAS

This short article is completely Q & A based on Data Mining in SSAS(SQL Server Analysis Services)where we will discuss fundamentals and Time Series Algorithms. So below are the questions and its proposed answer which will help reader to improve fundamentals of Data Mining in SSAS.

What is Data Mining?

  • Data Mining is nothing but some kind of logic or algorithm running on genuine historical data to bring out some kind of prediction.
  • So Data Mining means mining or hunting from the genuine historical data using some algorithms to do prediction, forecasting, etc. in an intelligent way.

Explain the term “training an algorithm”?

  • The term “training an algorithm” means that whatever algorithm we have with us is dead or not useful. Now to make that algorithm alive or useful that algorithm has to be trained.
  • Training an algorithm means that algorithm is running on a genuine historical data and as soon as it runs on the genuine data the algorithm starts working and puts some mind into it. By running on the data it will try to figure out what will be the future prediction of the data.
  • In short we say that to train an algorithm we have to make that algorithm to run on genuine data and this is how the algorithm gets trained.
  • Now trained means we are artificially putting some intelligence on a dead algorithm so that it can think in terms of that data to do forecasting, predictions, etc.

In what scenarios would we use Time Series Algorithm?

  • The Time Series Algorithm is used in scenarios where we want to do forecasting like about the annual sales, annual sales profit and various other forecasting.

What type of project will we select for Data Mining?

  • For Data Mining we will select our regular Multidimensional project as we used to select the project when we were doing SSAS.

For doing Data Mining Cube is compulsory?

  • Now usually it is not compulsory to have a Cube or create a Cube for doing Data Mining. But professionally in the industry for doing Data Mining Cube is necessary.
  • Because data coming from the Cube will be faster, so as a best industry practice Cube is necessary for Data Mining but it is not compulsory because Data Mining can also be done without Cube.

Explain Sequence Clustering Algorithm?

  • Sequence Clustering Algorithm is a combination of sequence analysis and clustering.
  • It identifies clusters of similarly ordered events in a sequence and these clusters can be used to predict events based on their characteristics.
  • For example there is a shop and some people will call first, then inquire and then go to the shop to purchase, some people see the advertisements and then go to the shop for purchasing & some people go directly to the shop for purchasing. So this algorithm will see which event of purchasing from the shop is most used and do analysis.

Explain Continuous & Key Time Data Types?

  • When we are configuring our mining structure there are two data types named Continuous & Key Time.
  • So Continuous means if we have Sales Amount so continuous data type will be like 100, 100.10, 100.20, 101, 101.5 etc.
  • And Key Time Type means suppose if we have Sales Year then the Key time data type will be like Saturday, Sunday ,Monday, Tuesday, etc. It is kind of a discreet data type.

What is Model in Data Mining?

  • Model is nothing but it is a thought process which will make us understand that how that thing will look like in real world or when it executes. For example if we want to build a car first we will build a model of a car.
  • Now in SSAS we have an algorithm and this algorithm is dead by itself so when this logic runs through the historical data it becomes alive or we can say get trained and have the thought process. And this thought process it uses to predict.
  • So Model means our Algorithm + our Historical Data. So basically it tells us how our algorithm has been trained and does the prediction in an intelligent way.
  • So Model = Algorithm getting trained using Historical data.

What is the query language for Data Mining?

  • The query language for Data Mining is called as DMX query language.

Explain Deviation in Time Series?

  • Deviation in Time Series means at what point the algorithm starting changing the trend or it started making movements.

Our algorithm is not showing the proper trend what can be the reason?

  • The reason for algorithm not showing the proper trend is due to the bad data or we can say not so genuine historical data.

Where does data mining model gets deployed?

  • The data mining model gets deployed in our analysis services in our SQL Server Management Studio.

Can we deploy data mining in tabular?

  • No, we cannot deploy our data mining in tabular.

Explain Predict Function in DMX?

  • Predict function in DMX is nothing but a kind of function used to do prediction on particular column or table using DMX queries.
  • Also this predict function says that what column reference to give and what are the number of items we want on which we want to do prediction. We can also specify the start and the end value so that the prediction can happen in between them.

If you are now feeling bored with huge dose of theory below is the practical SSIS project video which will refresh you and go energetic: –

Posted in MSBI Interview Question, Uncategorized | Tagged , , , , , , , | Leave a comment

MSBI Interview Question’s:- Part 1

We have come up with few real-time questions asked to one of the fresher candidate during MSBI(SSIS, SSAS and SSRS) interview questions if you are one of those who is seeking for the same then below questions and answers will surely help you.

1. Is SQL Server required for MSBI Development?

  • Yes, SQL Server is required for MSBI Development.

2. What components did you use in SSIS package development?

  • We did use many components is SSIS like Conditional Split, Merge, Merge Join, Sort, SCD, CDC, Data Conversion, Execute Task component, Script Task, etc.

3. What is the difference between Merge & Union All?

  • They are both transformations of SSIS. The first difference between them is that Merge can only accept two datasets as input while Union All can accept multiple datasets as input.
  • The second main difference is that Merge requires both datasets to be sorted while Union All does not require sorting of the datasets.

4. What is an MDX query and give its sample query?

  • MDX Query is used to retrieve information stored in an OLAP Cube. It gives us the Multidimensional view of the data. SQL Query can handle only 2 dimensions in a query for processing tabular data whereas MDX Query can process more dimensions in a Query.
  • Example of MDX Query, “Select From [OLAPCubeName];”,

“SelectDimension.Member on Column From [Cube Name];”

5. What do you mean by SCD and explain its types?

  • SCD means Slowly Changing Dimensions. As the name says it is used where our dimensions are not changing that frequently. These dimensions can be our master tables like Country master, Currency master. We know that the countries or currencies do not change frequently so therefore these dimensions are termed as Slowly Changing Dimensions. Now SCD is of three types Type 0, Type 1 & Type 2.
  • Type 0 is known as Fixed Attribute. It means that whenever any changes are detected, it will consider those changes as errors.
  • Type 1 is known as Changing Attribute. It means that whenever any changes are done it will overwrite the new value on the old value.
  • Type 2 is known as Historical Attribute. It means that whenever any changes are done, it will show us both the old value and the new value.

6. How to generate a Cube is SSAS?

  • Cubes are multidimensional database which stores data to do analysis process. A Cube contains a Fact or measure in its center and Dimensions all around it means dimensions act as sides of the cube.
  • So when we create a SSAS project according the project structure first we have to provide the source from where our data or Facts & Dimensions will come. As soon as we do that, in our Data Source Views we see that our SSAS Database diagram is created. Here if we want we can change the relationship also. Remember that our SSAS database and our SQL Database are not linked, it means that any changes we do in our SSAS database it will not affect our SQL Database. For example, we change our relations and our Database diagram of SSAS Database, so it will not change the relations or the database diagram of our SQL Database.
  • After that there is a folder of Cube where on right click comes an option to create cube. Once we select it there will be a wizard opened where we have to configure the cube and follow the wizard step by step and our cube is generated. Inside the wizard it will ask us which Fact and which dimensions we want in our Cube.
  • In this way the Cube will be generated.

7. What are the different project deployment methods is SSIS?

  • There are basically two methods of project deployment in SSIS, they are Project deployment and Package deployment.
  • In project deployment the project is deployed in one go and this technique is preferred way of deploying SSIS project. It was introduced from year 2012. By this method we can deploy our project in SQL Server, MSDB & File System.
  • In package deployment individual package is deployed. This method is used when the whole project is deployed and after that there are some changes to be done on a package or new packages are to be added at that time we use this package deployment method. This method was active till 2008 and with this method we can deploy our project in our MSDB & File System and not in our SQL Server.

8. What do we mean by Primary Key & Foreign Key in SQL Table?

  • Primary & Foreign Key concepts are very important in creating a SQL Table. Primary key means it is a key given to that column which will be unique and non-repeating. Usually Primary key is given to the numbers or Id’s in a table.
  • Foreign Key means it is a key which is related to the primary key. There can be no foreign key without primary key. A table can have many foreign keys but it can have only one primary key.

9. What are the different joins in SQL?

  • There are different types of Joins which are there in SQL when we want to join the records of two or more tables.
  • These joins are Inner Join, Left Outer Join, Right Outer Join, and Full Outer Join.

10. What is the query syntax of all the different joins in SQL?

  • Inner Join:

Select * From Employeetable inner join Departmenttable

ON Employeetable = Departmenttable;

  • Left Outer Join:

Select * From Employeetable left join Departmenttable

ON Employeetable = Departmenttable;

  • Right Outer Join:

Select * From Employeetable right join Departmenttable

ON Employeetable = Departmenttable;

  • Full Outer Join:

Select * From Employeetable full outer join Departmenttable

ON Employeetable = Departmenttable;

Above given questions and answers will serve as last minute revision on MSBI topic or will help as manual to prepare for interview. If you want more learning on MSBI topic will
practical full hands-on MSBI training in Mumbai so if you are from Mumbai city then visit us it will surely help to shape your MSBI career.

If you are not from Mumbai then learn MSBI through self-study learning MSBI project videos series especially for distance learners which has turn-out be good substitute of offline learning. Below is first video which starts with SSIS from that project series: –

Posted in MSBI, MSBI Interview Question, Uncategorized | Tagged , , , , , | Leave a comment

How to perform Script Task and send email using Script Task?

Script Tasks are used to send an email or do any web related tasks where customization is required. In Script Task for doing the customization we have to write C# code. It would be helpful for the MSBI training candidate if they have earlier worked with Visual Studio as writing C# code would require this tool to perform this lab easily.

So first of all we have to create a new package and inside that package we have to drag and drop the Script Task component from SSIS Toolbox.

1

After doing that we have to just double click on the Script Task component to configure it.

2

Now to write the C# code we have to click on option Edit Script and then a new Visual Studio window will open where we will write our C# code.

3

Now we will write the C# code.

4

This is how our Visual Studio window will look like where we have to write the code.

5

publicvoid Main()
		{
// TODO: Add your code here

// Step 1 :  Configure SMTP..
SmtpClient client = newSmtpClient();
client.Host = "smtp.gmail.com";
client.Port = 587;
client.EnableSsl = true;

//Step 2 : Provide Credentials..
string Username = "xyz123@gmail.com";
string Password = "xyz123";

NetworkCredential credential = newNetworkCredential(Username,Password);
client.Credentials = credential;

//Step 3: Configure from address..
MailAddressFromAddress = newMailAddress(Username, "John Doe");

//Step 4 : Create message to be sent..
MailMessage message = newMailMessage();
message.From = FromAddress;
message.To.Add("abc@gmail.com");
message.Subject = "SSIS Script Task";
message.IsBodyHtml = true;
message.Body = "Hello World from MSBI";

//Step 5: Send Email..
client.Send(message);

			Dts.TaskResult = (int)ScriptResults.Success;
		}

This is the C# code written step by step. No worries if candidate of MSBI traininghas never worked with C#,faculty helps to learn, understand and execute this lab.

After writing the code, we will close the window and click OK on our Script Task configuration dialog box.

6

After this done save the Script Task component and then execute the package.

In this way we can send email using Script Task component.

Also get to see following video on MSBI project series which start from the basic to start SSIS : –

Posted in MSBI, Uncategorized | Tagged , , , , , , | Leave a comment

3 things to avoid during Job Interviews By Questpond.

First and foremost avoid the below three things:-

Avoid saying negative about your previous organization because that would send wrong message to the interviewer. Do not create a pessimistic image before the interviewer.On the contrary if you start with positive things like :- “The current company has no issues and I am extremely happy with the current environment but it’s more of a personal problem etc etc…” that would create a very strong and positive image of yours.

Avoid complaining about your current work profile like “It’s a maintenance project, it’s in 1.1 version and I am not learning anything new etc.”. It’s very much possible that the interviewer has a maintenance job. So in such kind of situation’s I have seen the interviewer rejecting people irrespective even though if they have performed well in technical round.

Avoid saying you are jumping only because of salary. Some of the interviewer’s do not like people who just look at monetary benefits. When you are asked about salary expectation these things could be better discussed during that conversation. Do not make it explicit during the interview. Bet me nobody likes greedy professionals:).

Below are some of the good reasons for which the interviewer does not cross question back:-

  • For better prospects and growth.
  • Office is far away I spend lot of time travelling and would like minimize my travelling time.
  • They are telling me to relocate but in my current situation it’s not viable.
  • I am developer but due to project requirement I am in a testing role and so want to switch.
  • Project is getting closed and the company does not have projects.
  • Company is not paying salary on time.
Posted in Uncategorized | Tagged , , , , , , , , | Leave a comment

QuestPond’s Interview Questions & Answers on (SSIS, SSAS and SSRS)MSBI

What are partitions in SSAS?

SSAS let us create cube. Cube is a multi-dimensional database. Data will be stored inside cube as dimensions and Fact Tables.
By default in order to store Fact tables (measure group tables) partitions will be created inside cube.
Be default for one fact table one partition will be created. Partition is simply a physical storage unit inside cube.

Note: End user is nothing to do with partition. Her/she never say, I want data from this or that partition.
For him/her it is always going to be single partition.

Can we span single fact table across multiple partition?

Yes, we can. We can create more than one partition for a single fact table.

Is there anything we have to keep in mind while creating multiple partitions for single fact table?

Yes, we have to make sure that each partition contain unique records.

Advantages of having partition:

  • Each partition can be stored inside a separate physical drive bringing parallel data access into picture.
  • Each partition can have its own aggregation logic.
  • Different storage setting can be defined for each partition. Example – one partition supports MOLAP whereas one supports ROLAP
  • Each partition can be processed independently
  • By keeping historical data and a new data of a particular method in separate partitions processing speed can be improved.

Does partitions store only method group data?

No, in case of aggregating logic is defined, it also stores aggregated data.

Also refer our article on 8 important SQL Server (SSIS, SSAS and SSRS)MSBI Interview questions

For technical trainings on various topics like WCF (Windows Communication Foundation), MVC (Model View Controller), Business Intelligence, Design Patterns, WPF, TFS and Basic fundamentals feel free to contact SukeshMarla@Gmail.com or visit www.sukesh-marla.com

For more stuff like this, click here. Subscribe to article updates or follow at twitter @SukeshMarla

See practical video on explaining SSIS, SSAS and SSRS (part 1) with sample demo: –

Posted in Uncategorized | Tagged , , , , , , , , | Leave a comment

QuestPond’s Interview Questions & Answers on ASP.NET MVC (Model View Controller)

Why MVC (Model View Controller) is loosely coupled?

In order to answer this question, first let’s understand what’s the difference between loosely coupling and tightly coupling is.

Look at the following pic.

p1

When we talk about human body, body parts are tightly coupled to each other. We cannot simply replace hand with some other hand or a leg with some other leg. This is called tightly coupling.
When we talk about computer most of the parts are loosely coupled. Take the RAM out put a new one, same with display, keypad etc. This is called loosely coupling.

In Web Forms User interaction logic is handled inside CodeBehind where as in Asp.net MVC it’s handled inside Controller. Now if we compare Web form Codebehind with MVC Controller biggest difference is, CodeBehind is tightly coupled with view where as controller wont tightly connected to any view.
Single controller can work with more than one view and similarly one view may contain references for more than one controller in form or anchor tag.

p2

For technical trainings on various topics like WCF (Windows Communication Foundation), MVC (Model View Controller), Business Intelligence, Design Patterns, WPF, TFS and Basic fundamentals feel free to contact SukeshMarla@Gmail.com or visit www.sukesh-marla.com

For more stuff like this, click here. Subscribe to article updates or follow at twitter @SukeshMarla

See the following MVC (Model View Controller) video on viewdata, viewbag, tempdata & session:-

 

Click and see here for more step by step training in ASP.NET MVC

 

 

 

 

Posted in Uncategorized | Tagged , , , , , , , | Leave a comment

QuestPond’s Interview Questions & Answers on ADO.NET

What is the use of Command objects?

Command object helps to execute SQL statements. Following are the methods provided by command object:-

  • ExecuteNonQuery: – Executes insert, update and delete SQL commands. Returns an Integer indicating the number of rows affected by the query.
  • ExecuteReader: – Executes select SQL statements which can either be in your .NET code or in a stored procedure. Returns a “Datareader” object.
  • ExecuteScalar: – Executes SQL command and returns only a single value like count,sum , first record etc.

How can we fine-tune the command object when we are expecting a single row?

Again, CommandBehaviour enumeration provides two values Single Result and Single Row. If you are expecting a single value then pass “CommandBehaviour.SingleResult” and the query is optimized accordingly, if you are expecting single row then pass “CommandBehaviour.SingleRow” and query is optimized according to single row.

What are Dataset objects?

Dataset is an in memory object with data tables, rows and columns. You can visualize it as in-memory RDBMS.Dataset has the following features:-

  • The in memory RDBMS works in a disconnected manner. In other words even if the connection is closed the dataset is still there in memory.
  • You can do modification in the in-memory database object and send the final changes to the database.

Below is a simple code snippet which shows how to access a column value. You can see how the full dataset object hierarchy is accessed to get the column value.

objDataset.Tables[0].Rows[0]["CustCode"]

dataset

How can we force the connection object to close after my data reader is closed?

Command method Execute reader takes a parameter called as Command Behavior wherein we can specify saying close connection automatically after the Data reader is close.

PobjDataReader = pobjCommand.ExecuteReader (CommandBehavior.Close
Connection)

See the following video on ADO.NET connection pooling: –

 

Click and see here for more training on ADO.NET

Posted in Uncategorized | Tagged , , , , , , , , | Leave a comment

QuestPond’s Interview Questions & Answers on .NET

What is a Class and structure’s?

Structures are value types and classes are reference types..

When to use Structures and When to use classes?

You will use structures when: –

Point 1:- If you want to represent a custom value type. This custom value type is derived from primitive data types (int, double). Some of the example’s of custom types are co-ordinates (which have X, Y), complex numbers (which have real and imaginary components). You can also term these things as value objects or technical objects. Technical objects do not represent real world objects like customer, supplier, invoice etc.

Point 2:- If you want to have low memory foot print. For instance let’s say you want to plot a graph. To plot a graph you need to have 100’s of objects created so that you can represent co-ordinates. Now if you create a class and generate those 100’s of objects you end up putting lot of load on your garbage collector. If you create a “struct” , it’s a value type. So they get created and destroyed immediately. Thus putting less load on memory.
For all other scenarios use a class.

datatype

 

What are similarities between Class and structure?

Following are the similarities between classes and structures:-

  • Both can have constructors, methods, properties, fields, constants, enumerations, events, and event handlers.
  • Structures and classes can implement interface.
  • Both of them can have constructors with and without parameter.
  • Both can have delegates and events.

What is the difference between Class and structure’s?

Following are the key differences between them:-

  • Structures are value types and classes are reference types. So structures use stack and classes use heap.
  • Structures members cannot be declared as protected, but class members can be. You cannot do inheritance in structures.
  • Structures do not require constructors while classes require.
  • Objects created from classes are terminated using Garbage collector. Structures are not destroyed using GC.

Click and see here for more basics training in .NET


 

Posted in Uncategorized | Tagged , , , , , , , | Leave a comment

QuestPond’s Interview Questions & Answers on ADO.NET

How is stored procedure different from functions?

  • Function cannot affect the state of the database which means we cannot perform CRUD operation on the database. Stored Procedure can affect the state of the database by using CRUD operations.
  • Store Procedure can return zero or n values whereas Function can return only one value.
  • Store Procedure can have input, output parameters for it whereas functions can have only input parameters.
  • Function can be called from Stored Procedure whereas Stored Procedure cannot be called from Function

How do we use stored procedure in ADO.NET and how do we provide parameters to the stored procedures?

ADO.NET provides the SqlCommand object, which provides the functionality of executing stored procedures. In the command type we need to provide the command type as stored procedure as shown in the below code snippet.

SqlCommand objCommand = new SqlCommand("sp_Insert", objConnection
;
objCommand.CommandType = CommandType.StoredProcedure;
objCommand.ExecuteNonQuery();

See following video on dataset is a disconnected while datareader is connected: –

 

Click and see here for more ADO.NET training.

Posted in Uncategorized | Tagged , , , , , , , , , | Leave a comment