Posted by : Arjun Lagisetty Wednesday, February 12, 2014

Ah! the classic tale of romance between the network and the app..

Problem: 

Long running DB session tasks from ODI are ending with a timeout error.

Diagnosis: 

When a long running task on the DB is kicked of from ODI. After a certain time ODI  session is ending with a timeout error. Irrespective of the status of the task kicked of on the db side. The culprit here was the firewall setting. Due to the corporate policies our firewall was configured to kill any "inactive" connections after an hour. Some of the DB tasks took more than hour, in this time there is no data movement between the agent and the database because these specific task are DML in nature and the processing is confined to the DB session itself. Since, there is no data movement firewall categorized this as an inactive connection and killed it.

Now comes the romance of the app and the network team. App blames network, network blames app and they live happily every after. Alas, only if it were true. Since, it was a corporate security policy, this rule could not be relaxed. It was the Apps responsibility to maintain an active connection. 

Solution: 

ODI inherently does not have any capability to ping the db at regular intervals once the session is kicked off. So, we had to shift this burden to our all time favorite hero the Oracle Database. Oracle SQL_NET profile can be configured with a parameter called SQLNET.EXPIRE_TIME, which is used to detect inactive connections by checking for an inactive client. By default this value is zero, if we assign a non-zero value (X) to this parameter Db server probes client every X minutes to ensure that the client is up. And the firewall thinks that the database is up because a signal is sent every X minute from the server to the client. And the network and the app live happily every after. :) 
I personally feel this solution is a hack and is very server specific (Oracle Db server). If you have a different database you should find out a similar parameters in your database. Though, it would be nice if oracle can build in this mechanism in the agent.




{ 6 comments ... read them below or Comment }

  1. Hi Admin,

    Nice blog. Thanks for sharing this.
    I am facing this problem and applied the solution above. But still the problem persist, I have done this setting at repository database level. Should i need to apply this setting at source and target databases? Please help.

    Thanks
    Bala

    ReplyDelete
  2. Yes, You should because the sessions are kicked off on the target or staging database and if the firewall terminates that connection, the complete signal does not reach agent and the session will be left as idle.

    ReplyDelete
  3. Hi,

    What is the exact error you are getting in this case?

    I have: ODI-1228: Task STG_EFD_DOC_FISCAL_HIST (Integração) fails on the target ORACLE connection AF_DBPRDDW_EFD.
    Caused By: java.sql.SQLRecoverableException: No more data to read from socket

    Could it be the same reason?

    ReplyDelete
  4. Looks like it. The tcp connection might be terminated. Wire shark should point out what exact problem is.

    ReplyDelete
  5. I´ve discovered that there are two sessions running in DB side. One keep interacting with database (inserts, updates, etc)

    Another one became without inactive doing anything...after 4 to 5 hours, the process hangs with Exceed maximum idle time.

    Do you know why it starts two sessions to the DB?

    ReplyDelete
  6. Not sure.. Check out the Action Column in the V$ tables for the session, check if which session number is present in the Action column.

    ReplyDelete

Popular Post

Labels

Blog Archive

Copyright © ODI Pundits - Oracle Data Integrator - Maintained by Arjun Lagisetty