Channel: Donghua's Blog - DBAGlobe

SQL Server 2016 AG Setup Part 5–AG2 between SQL01\AGINST2 and SQL02\AGINST2


In this example, the same port number is reused purely for illustration. Using a different port for each AG and instance is not a requirement; it is simply a best practice, since one instance can support multiple availability groups.


Important point: since AG1 is already running on port 5022 (with different SQL instances), we need to choose a different port here to avoid a conflict.


Make sure SQL01\AGINST2 and SQL02\AGINST2 listen only on specific IP addresses instead of “Listen All”; listening on all addresses would block the AG listener from binding to the same port.


Otherwise, connections to the AG listener will not succeed.



SQL Server 2016 AG Setup Part 6–AG3 between SQL01\AGINST2 and SQL02\AGINST2

Use standard logging for audit purpose in PostgreSQL 9.6


Configuration:

$ grep -i audit postgresql.conf
log_connections = on  # audit setting
log_disconnections = on # audit setting
log_line_prefix = '<%m:%r:%u@%d:[%p]:> '        # audit setting
log_statement = 'all'                   # audit setting

Sample pg_log output:

<2017-07-18 20:25:12.374 +08:127.0.0.1(57640):[unknown]@[unknown]:[3541]:> LOG:  connection received: host=127.0.0.1 port=57640
<2017-07-18 20:25:12.375 +08:127.0.0.1(57640):admin1@testdb:[3541]:> LOG:  connection authorized: user=admin1 database=testdb
<2017-07-18 20:25:13.037 +08:127.0.0.1(57640):admin1@testdb:[3541]:> LOG:  disconnection: session time: 0:00:00.662 user=admin1 database=testdb host=127.0.0.1 port=57640
<2017-07-18 20:25:17.622 +08:127.0.0.1(57642):[unknown]@[unknown]:[3543]:> LOG:  connection received: host=127.0.0.1 port=57642
<2017-07-18 20:25:17.623 +08:127.0.0.1(57642):admin1@testdb:[3543]:> LOG:  connection authorized: user=admin1 database=testdb
<2017-07-18 20:25:32.728 +08:127.0.0.1(57642):admin1@testdb:[3543]:> LOG:  statement: create table t1(id integer);
<2017-07-18 20:25:41.154 +08:127.0.0.1(57642):admin1@testdb:[3543]:> LOG:  statement: insert into t1 values (1);
<2017-07-18 20:25:47.598 +08:127.0.0.1(57642):admin1@testdb:[3543]:> LOG:  statement: select * from t1;
<2017-07-18 20:25:50.049 +08:127.0.0.1(57642):admin1@testdb:[3543]:> LOG:  statement: drop table t1;
<2017-07-18 20:25:54.762 +08:127.0.0.1(57642):admin1@testdb:[3543]:> LOG:  disconnection: session time: 0:00:37.139 user=admin1 database=testdb host=127.0.0.1 port=57642
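The log_line_prefix value '<%m:%r:%u@%d:[%p]:> ' stamps each entry with the timestamp (%m), remote host and port (%r), user (%u), database (%d) and backend PID (%p). As a rough illustration (the regex and helper below are my own sketch, not part of PostgreSQL), such lines can be parsed back into fields:

```python
import re

# Matches the prefix '<%m:%r:%u@%d:[%p]:> ' configured above. Example line:
# <2017-07-18 20:25:12.374 +08:127.0.0.1(57640):[unknown]@[unknown]:[3541]:> LOG: ...
LINE_RE = re.compile(
    r"<\s*(?P<ts>[\d\-]+ [\d:.]+ [+\-]\d+):"   # %m: timestamp with milliseconds
    r"(?P<remote>[^:]*):"                      # %r: remote host(port); empty for internal processes
    r"(?P<user>[^@]*)@(?P<db>[^:]*):"          # %u@%d: user and database
    r"\[(?P<pid>\d+)\]:>\s+"                   # [%p]: backend PID
    r"(?P<level>\w+):\s+(?P<msg>.*)"           # severity and message
)

def parse_pg_log_line(line):
    m = LINE_RE.match(line)
    return m.groupdict() if m else None

entry = parse_pg_log_line(
    "<2017-07-18 20:25:12.374 +08:127.0.0.1(57640):[unknown]@[unknown]:[3541]:> "
    "LOG:  connection received: host=127.0.0.1 port=57640"
)
print(entry["pid"], entry["level"])   # → 3541 LOG
```

A parser like this is handy when feeding the audit trail into a log-analysis pipeline.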


Install and Configure PGAUDIT in PostgreSQL 9.6 step by step


Install prerequisites

# yum install readline readline-devel zlib zlib-devel bison bison-devel flex flex-devel

Clone the PostgreSQL repository:

git clone https://github.com/postgres/postgres.git

Checkout REL9_6_STABLE branch:

cd postgres

git checkout REL9_6_STABLE

Make PostgreSQL:

./configure --enable-debug --prefix=/var/lib/pgsql/pgsql_latest/ --with-pgport=5555
make install -s

Change to the contrib directory:

cd contrib

Clone the pgAudit extension:

git clone https://github.com/pgaudit/pgaudit.git

Change to pgAudit directory:

cd pgaudit

Build pgAudit and run regression tests:

make -s check

============== creating temporary instance            ==============
============== initializing database system           ==============
============== starting postmaster                    ==============
running on port 57835 with PID 17530
============== creating database "contrib_regression" ==============
CREATE DATABASE
ALTER DATABASE
============== running regression test queries        ==============
test pgaudit                  ... ok
============== shutting down postmaster               ==============
============== removing temporary instance            ==============

=====================
  All 1 tests passed.
=====================

Install pgAudit:

make install

/bin/mkdir -p '/var/lib/pgsql/pgsql_latest/lib'
/bin/mkdir -p '/var/lib/pgsql/pgsql_latest/share/extension'
/bin/mkdir -p '/var/lib/pgsql/pgsql_latest/share/extension'
/bin/install -c -m 755  pgaudit.so '/var/lib/pgsql/pgsql_latest/lib/pgaudit.so'
/bin/install -c -m 644 ./pgaudit.control '/var/lib/pgsql/pgsql_latest/share/extension/'
/bin/install -c -m 644 ./pgaudit--1.1.1.sql ./pgaudit--1.0--1.1.1.sql  '/var/lib/pgsql/pgsql_latest/share/extension/'

Configure parameters:

$ grep -i audit postgresql.conf
shared_preload_libraries = 'pgaudit'
pgaudit.log = 'all, -misc'
log_connections = on  # audit setting
log_disconnections = on # audit setting
log_line_prefix = '<%m:%r:%u@%d:[%p]:> '        # audit setting
log_statement = 'none'                  # audit setting
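For reference, pgaudit.log accepts a comma-separated list of statement classes (READ, WRITE, FUNCTION, ROLE, DDL, MISC, or ALL), and a leading minus subtracts a class, so 'all, -misc' audits everything except miscellaneous commands. A small Python sketch (illustrative only; pgaudit resolves this internally in C) of how such a setting expands:

```python
# Resolve a pgaudit.log value such as 'all, -misc' into the effective set of
# audited statement classes; class names follow the pgaudit documentation.
ALL_CLASSES = {"READ", "WRITE", "FUNCTION", "ROLE", "DDL", "MISC"}

def effective_classes(setting):
    enabled = set()
    for token in (t.strip().upper() for t in setting.split(",")):
        negate = token.startswith("-")
        name = token.lstrip("-")
        members = ALL_CLASSES if name == "ALL" else {name}
        if negate:
            enabled -= members   # a leading minus subtracts the class
        else:
            enabled |= members
    return enabled

print(sorted(effective_classes("all, -misc")))
# → ['DDL', 'FUNCTION', 'READ', 'ROLE', 'WRITE']
```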

Startup Log:

$ /var/lib/pgsql/pgsql_latest/bin/pg_ctl start -D /var/lib/pgsql/9.6/data
server starting
<2017-07-18 22:10:11.455 +08::@:[17758]:> LOG:  pgaudit extension initialized
<2017-07-18 22:10:11.470 +08::@:[17758]:> LOG:  redirecting log output to logging collector process
<2017-07-18 22:10:11.470 +08::@:[17758]:> HINT:  Future log output will appear in directory "pg_log".

Sample Output:

<2017-07-18 22:12:05.776 +08:127.0.0.1(54486):[unknown]@[unknown]:[17804]:> LOG:  connection received: host=127.0.0.1 port=54486
<2017-07-18 22:12:12.429 +08:127.0.0.1(54488):[unknown]@[unknown]:[17807]:> LOG:  connection received: host=127.0.0.1 port=54488
<2017-07-18 22:12:12.430 +08:127.0.0.1(54488):admin1@testdb:[17807]:> LOG:  connection authorized: user=admin1 database=testdb
<2017-07-18 22:12:37.644 +08:127.0.0.1(54488):admin1@testdb:[17807]:> LOG:  AUDIT: SESSION,1,1,DDL,CREATE TABLE,,,create table t1(i integer);,<not logged>
<2017-07-18 22:13:09.207 +08:127.0.0.1(54488):admin1@testdb:[17807]:> LOG:  AUDIT: SESSION,2,1,WRITE,INSERT,,,insert into t1 values (1);,<not logged>
<2017-07-18 22:13:13.911 +08:127.0.0.1(54488):admin1@testdb:[17807]:> LOG:  AUDIT: SESSION,3,1,READ,SELECT,,,select * from t1;,<not logged>
<2017-07-18 22:13:15.232 +08:127.0.0.1(54488):admin1@testdb:[17807]:> LOG:  AUDIT: SESSION,4,1,READ,SELECT,,,select * from t1;,<not logged>
<2017-07-18 22:13:52.766 +08:127.0.0.1(54488):admin1@testdb:[17807]:> ERROR:  column "id2" does not exist at character 8
<2017-07-18 22:13:52.766 +08:127.0.0.1(54488):admin1@testdb:[17807]:> STATEMENT:  select id2 from t1;
<2017-07-18 22:14:13.596 +08:127.0.0.1(54488):admin1@testdb:[17807]:> LOG:  AUDIT: SESSION,5,1,WRITE,DELETE,,,delete from t1;,<not logged>
<2017-07-18 22:14:23.391 +08:127.0.0.1(54488):admin1@testdb:[17807]:> LOG:  AUDIT: SESSION,6,1,WRITE,TRUNCATE TABLE,,,truncate table t1;,<not logged>
<2017-07-18 22:14:26.746 +08:127.0.0.1(54488):admin1@testdb:[17807]:> LOG:  AUDIT: SESSION,7,1,DDL,DROP TABLE,,,drop table t1;,<not logged>
<2017-07-18 22:14:29.103 +08:127.0.0.1(54488):admin1@testdb:[17807]:> LOG:  disconnection: session time: 0:02:16.674 user=admin1 database=testdb host=127.0.0.1 port=54488
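Each AUDIT entry carries a CSV payload: audit type, statement and substatement counters, class, command, object type, object name, the statement text, and parameters. A hedged Python sketch (the field labels are mine, following the pgaudit documentation) that parses one payload:

```python
import csv
import io

# Field names per the pgaudit documentation for a session audit entry.
FIELDS = ["audit_type", "statement_id", "substatement_id", "class",
          "command", "object_type", "object_name", "statement", "parameter"]

def parse_audit_payload(payload):
    # pgaudit emits a standard CSV record after the 'AUDIT: ' marker
    row = next(csv.reader(io.StringIO(payload)))
    return dict(zip(FIELDS, row))

entry = parse_audit_payload(
    "SESSION,1,1,DDL,CREATE TABLE,,,create table t1(i integer);,<not logged>"
)
print(entry["class"], entry["command"])   # → DDL CREATE TABLE
```

Note that statements containing commas come back quoted, which is why a CSV reader is safer than a plain split.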


Add firewall rules in Red Hat EL7 for Oracle Database

[root@vmxdb01 ~]# firewall-cmd --permanent --zone=public --add-port=1521/tcp
success

[root@vmxdb01 ~]# firewall-cmd --reload
success

[root@vmxdb01 ~]# firewall-cmd --permanent --zone=public --list-ports
1521/tcp

[root@vmxdb01 ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp0s3
  sources:
  services: dhcpv6-client ssh
  ports: 1521/tcp
  protocols:
  masquerade: no
  forward-ports:
  sourceports:
  icmp-blocks:
  rich rules:

  

Verification of SQL Server JDBC integrated authentication with sample script

SET JAVA_HOME="C:\Program Files\Java\jdk1.8.0_131"

SET CLASSPATH=.;C:\temp\sqljdbc_6.2\enu\mssql-jdbc-6.2.1.jre8.jar


C:\temp\sqljdbc_6.2\enu\samples\connections>%JAVA_HOME%\bin\javac connectURL.java

SET CLASSPATH=.;C:\temp\sqljdbc_6.2\enu\mssql-jdbc-6.2.1.jre8.jar;C:\temp\sqljdbc_6.2\enu\auth\x64


C:\temp\sqljdbc_6.2\enu\samples\connections>%JAVA_HOME%\bin\java connectURL
Aug 04, 2017 9:48:25 AM com.microsoft.sqlserver.jdbc.AuthenticationJNI
WARNING: Failed to load the sqljdbc_auth.dll cause : no sqljdbc_auth in java.library.path
com.microsoft.sqlserver.jdbc.SQLServerException: This driver is not configured for integrated authentication. ClientConnectionId:9f25a766-3663-4bc5-b68c-19a551cbcd20
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:2435)
        at com.microsoft.sqlserver.jdbc.AuthenticationJNI.&lt;clinit&gt;(AuthenticationJNI.java:75)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:3129)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.access$100(SQLServerConnection.java:82)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:3121)
        at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7151)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2478)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:2026)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1687)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1528)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:866)
        at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:569)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:270)
        at connectURL.main(connectURL.java:43)
Caused by: java.lang.UnsatisfiedLinkError: no sqljdbc_auth in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
        at java.lang.Runtime.loadLibrary0(Runtime.java:870)
        at java.lang.System.loadLibrary(System.java:1122)
        at com.microsoft.sqlserver.jdbc.AuthenticationJNI.&lt;clinit&gt;(AuthenticationJNI.java:50)
        ... 13 more


C:\temp\sqljdbc_6.2\enu\samples\connections>dir "C:\temp\sqljdbc_6.2\enu\auth\x64"
 Volume in drive C has no label.
 Volume Serial Number is D0AD-1A2D

 Directory of C:\temp\sqljdbc_6.2\enu\auth\x64

08/04/2017  09:38 AM    &lt;DIR&gt;          .
08/04/2017  09:38 AM    &lt;DIR&gt;          ..
07/14/2017  11:41 AM           310,480 sqljdbc_auth.dll
               1 File(s)        310,480 bytes
               2 Dir(s)  58,810,060,800 bytes free

C:\temp\sqljdbc_6.2\enu\samples\connections>SET PATH=%PATH%;C:\temp\sqljdbc_6.2\enu\auth\x64

C:\temp\sqljdbc_6.2\enu\samples\connections>%JAVA_HOME%\bin\java connectURL
1 Accounting Manager
2 Assistant Sales Agent
3 Assistant Sales Representative
4 Coordinator Foreign Markets
5 Export Administrator
6 International Marketing Manager
7 Marketing Assistant
8 Marketing Manager
9 Marketing Representative
10 Order Administrator

=========================== connectURL.java ===========================
import java.sql.*;

public class connectURL {

    public static void main(String[] args) {

        // Create a variable for the connection string.
        String connectionUrl = "jdbc:sqlserver://WIN-L2D9O5BHNHA:2433;" +
                "databaseName=AdventureWorks2012;integratedSecurity=true;";

        /*
        // The connection string below uses the instance name for a named instance;
        // it requires the SQL Server Browser service to be up and running.
        String connectionUrl = "jdbc:sqlserver://WIN-L2D9O5BHNHA;" +
                "instanceName=PROD;databaseName=AdventureWorks2012;integratedSecurity=true;";
        */

        // Declare the JDBC objects.
        Connection con = null;
        Statement stmt = null;
        ResultSet rs = null;

        try {
            // Establish the connection.
            Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
            con = DriverManager.getConnection(connectionUrl);

            // Create and execute an SQL statement that returns some data.
            String SQL = "SELECT TOP 10 * FROM Person.ContactType";
            stmt = con.createStatement();
            rs = stmt.executeQuery(SQL);

            // Iterate through the data in the result set and display it.
            while (rs.next()) {
                System.out.println(rs.getString(1) + " " + rs.getString(2));
            }
        }
        // Handle any errors that may have occurred.
        catch (Exception e) {
            e.printStackTrace();
        }
        finally {
            if (rs != null) try { rs.close(); } catch (Exception e) {}
            if (stmt != null) try { stmt.close(); } catch (Exception e) {}
            if (con != null) try { con.close(); } catch (Exception e) {}
        }
    }
}

Oracle 12c Partial Indexes for Partitioned Tables


SQL> CREATE TABLE orders (
  2      order_id       NUMBER(12),
  3      order_date     DATE,
  4      order_mode     VARCHAR2(8),
  5      customer_id    NUMBER(6),
  6      order_status   NUMBER(2),
  7      order_total    NUMBER(8,2),
  8      sales_rep_id   NUMBER(6),
  9      promotion_id   NUMBER(6),
  10       CONSTRAINT order_pk PRIMARY KEY ( order_id )
  11   )
  12       INDEXING OFF
  13           PARTITION BY RANGE ( order_date ) (
  14           PARTITION P2004 VALUES LESS THAN (TO_DATE('01-JAN-2005','DD-MON-YYYY')) INDEXING OFF,
  15           PARTITION P2005 VALUES LESS THAN (TO_DATE('01-JAN-2006','DD-MON-YYYY')) INDEXING OFF,
  16           PARTITION P2006 VALUES LESS THAN (TO_DATE('01-JAN-2007','DD-MON-YYYY')) INDEXING OFF,
  17           PARTITION P2007 VALUES LESS THAN (TO_DATE('01-JAN-2008','DD-MON-YYYY')) INDEXING OFF,
  18           PARTITION P2008 VALUES LESS THAN (TO_DATE('01-JAN-2009','DD-MON-YYYY')) INDEXING ON,
  19           PARTITION P2009 VALUES LESS THAN (TO_DATE('01-JAN-2010','DD-MON-YYYY')) INDEXING ON
  20           )
  21  /

Table ORDERS created.

SQL> select order_date from oe.orders sample(10);
ORDER_DATE
-------------------------------
14-SEP-06 06.03.04.763452000 AM
17-NOV-06 01.22.11.262552000 AM
12-MAR-07 08.53.54.562432000 PM
29-MAR-07 03.41.20.945676000 PM
07-JUN-07 05.18.08.883310000 AM
16-AUG-07 02.34.12.234359000 PM
10-NOV-07 04.49.25.526321000 AM
27-FEB-08 03.41.45.109654000 AM
26-JUN-08 09.19.43.190089000 PM

9 rows selected.

SQL> insert into orders select * from oe.orders;
105 rows inserted.

SQL> commit;

Commit complete.

SQL> create index order_gi1 on orders (sales_rep_id) global indexing partial;

Index ORDER_GI1 created.

SQL> create index order_li1 on orders (customer_id) local indexing partial;

Index ORDER_LI1 created.

SQL> set sqlformat ansiconsole

SQL> select partition_name,indexing from user_tab_partitions where table_name='ORDERS';
PARTITION_NAME  INDEXING
P2004           OFF
P2005           OFF
P2006           OFF
P2007           OFF
P2008           ON
P2009           ON

6 rows selected.

SQL> select index_name,partition_name,status from user_ind_partitions;
INDEX_NAME  PARTITION_NAME  STATUS
ORDER_LI1   P2004           UNUSABLE
ORDER_LI1   P2005           UNUSABLE
ORDER_LI1   P2006           UNUSABLE
ORDER_LI1   P2007           UNUSABLE
ORDER_LI1   P2008           USABLE
ORDER_LI1   P2009           USABLE

6 rows selected.


SQL> select index_name,indexing from user_indexes;
INDEX_NAME  INDEXING
ORDER_LI1   PARTIAL
ORDER_PK    FULL
ORDER_GI1   PARTIAL


SQL> explain plan for select * from orders where sales_rep_id=154;
Explained.

SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT                                                                                                        
Plan hash value: 670661013                                                                                               
                                                                                                                          
--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |           |     1 |    93 |   821   (1)| 00:00:01 |       |       |
|   1 |  VIEW                                        | VW_TE_2   |     2 |   186 |   821   (1)| 00:00:01 |       |       |
|   2 |   UNION-ALL                                  |           |       |       |            |          |       |       |
|*  3 |    TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| ORDERS    |     1 |    93 |     1   (0)| 00:00:01 | ROWID | ROWID |
|*  4 |     INDEX RANGE SCAN                         | ORDER_GI1 |     1 |       |     1   (0)| 00:00:01 |       |       |
|   5 |    PARTITION RANGE ITERATOR                  |           |     1 |    93 |   820   (1)| 00:00:01 |     1 |     4 |
|*  6 |     TABLE ACCESS FULL                        | ORDERS    |     1 |    93 |   820   (1)| 00:00:01 |     1 |     4 |
--------------------------------------------------------------------------------------------------------------------------
                
Predicate Information (identified by operation id):                                                                      
---------------------------------------------------                                  
   3 - filter("ORDERS"."ORDER_DATE">=TO_DATE(' 2008-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND        
              "ORDERS"."ORDER_DATE"<TO_DATE(' 2010-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))  
   4 - access("SALES_REP_ID"=154)                                                                    
   6 - filter("SALES_REP_ID"=154)  
       
Note                                                                                        
-----                                                                      
   - dynamic statistics used: dynamic sampling (level=2)                        

25 rows selected.



SQL> explain plan for select * from orders where customer_id=104;
Explained.


SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT                                                                                                        
Plan hash value: 4090115495                                                                                              
                                                                                                                          
--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |           |     1 |    93 |   963   (1)| 00:00:01 |       |       |
|   1 |  VIEW                                        | VW_TE_2   |    17 |  1581 |   963   (1)| 00:00:01 |       |       |
|   2 |   UNION-ALL                                  |           |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR                  |           |    16 |  1488 |   143   (0)| 00:00:01 |     5 |     6 |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| ORDERS    |    16 |  1488 |   143   (0)| 00:00:01 |     5 |     6 |
|*  5 |      INDEX RANGE SCAN                        | ORDER_LI1 |    16 |       |     1   (0)| 00:00:01 |     5 |     6 |
|   6 |    PARTITION RANGE ITERATOR                  |           |     1 |    93 |   820   (1)| 00:00:01 |     1 |     4 |
|*  7 |     TABLE ACCESS FULL                        | ORDERS    |     1 |    93 |   820   (1)| 00:00:01 |     1 |     4 |
--------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):                                                                      
---------------------------------------------------                  
   5 - access("CUSTOMER_ID"=104)                  
   7 - filter("CUSTOMER_ID"=104)                 
                                                                                                                         
Note                                                                                                                     
-----              
   - dynamic statistics used: dynamic sampling (level=2) 


24 rows selected.

Display output vertically in SQLcl



The JavaScript below, run through SQLcl's script command, executes the SQL passed as arguments and prints each row vertically, one "column : value" pair per line:

// Rebuild the SQL statement from the script arguments (args[0] is the script name)
var sql = "";
for (var i = 1; i < args.length; i++) {
  sql = sql + " " + args[i];
}
ctx.write("\n\n SQL Statement: \n" + sql + "\n\n");

// Execute the statement; returns a list of rows, with row 0 holding the column headers
var ret = util.executeReturnListofList(sql, null);

for (var i = 0; i < ret.size(); i++) {
    if (i == 0) { ctx.write('> SKIP HEADER ROW\n'); continue; }
    else { ctx.write('>ROW \n'); }
    // Print each column as "header : value" on its own line
    for (var ii = 0; ii < ret[i].size(); ii++) {
        ctx.write("\t" + ret[0][ii] + " : " + ret[i][ii] + "\n");
    }
}

ctx.write('\n\n');


Port and firewall settings to enable Neo4j on VirtualBox


Prerequisites: to install Neo4j Community edition, follow the instructions at http://yum.neo4j.org/stable/

Below are the steps required to configure the Linux RHEL 7 firewall and VirtualBox port forwarding so that Neo4j is reachable from the PC running VirtualBox.

- In the Neo4j configuration file, uncomment the line that sets the listen address (typically dbms.connectors.default_listen_address=0.0.0.0, so Neo4j accepts non-local connections).

- Add firewall rules for Neo4j (Bolt 7687, HTTP 7474 and HTTPS 7473):

[root@vmxdb01 tmp]# firewall-cmd --permanent --zone=public --add-port=7474/tcp
success
[root@vmxdb01 tmp]# firewall-cmd --permanent --zone=public --add-port=7687/tcp
success

[root@vmxdb01 databases]# firewall-cmd --permanent --zone=public --add-port=7473/tcp
success

Run firewall-cmd --reload afterwards so the permanent rules take effect.

- Launch Neo4j from the PC client: http://127.0.0.1:7474
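To confirm the forwarded ports actually reach the guest, a quick check can be scripted; the helper below is a hypothetical sketch (any port scanner, such as nc, would do equally well):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Bolt 7687, HTTP 7474, HTTPS 7473, as forwarded by VirtualBox
for port in (7687, 7474, 7473):
    print(port, "open" if port_open("127.0.0.1", port) else "closed")
```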


Stop automatic yum package updates in RHEL 7 / CentOS 7


[root@localhost ~]# yum update -y
Loaded plugins: fastestmirror, langpacks
Existing lock /var/run/yum.pid: another copy is running as pid 13905.
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: PackageKit
    Memory :  26 M RSS (429 MB VSZ)
    Started: Sat Oct 21 06:42:11 2017 - 00:44 ago
    State  : Sleeping, pid: 13905
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: PackageKit
    Memory :  26 M RSS (429 MB VSZ)
    Started: Sat Oct 21 06:42:11 2017 - 00:46 ago
     State  : Sleeping, pid: 13905


[root@localhost ~]# ps -ef|grep 14172
root     14172 11949 19 06:44 ?        00:00:03 /usr/bin/python /usr/share/PackageKit/helpers/yum/yumBackend.py update-packages only-trusted;only-download systemd-sysv;219-42.el7_4.4;x86_64;updates&seabios-<…..>
[root@localhost ~]# systemctl -l|grep package
packagekit.service                                                                       loaded active running   PackageKit Daemon

[root@localhost ~]# systemctl disable packagekit.service
[root@localhost ~]# systemctl stop packagekit.service


[root@localhost ~]# ps -ef|grep 14172
root     14282 13993  0 06:47 pts/1    00:00:00 grep --color=auto 14172

Custom installation of CDH5 from Parcels fails with a permission issue


Scenario: YARN and HDFS, with optional Spark, were chosen during the installation.

Error message:

2017-11-04 23:47:56,151 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/LOCK: Permission denied
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:181)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:562)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:609)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/LOCK: Permission denied
        at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
        at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
        at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
        at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.openDatabase(NMLeveldbStateStoreService.java:944)
        at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:931)
        at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:204)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        ... 5 more
2017-11-04 23:47:56,176 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at cdh-vm.dbaglobe.com/192.168.56.10

************************************************************/

The cause: permissions are incorrect for the CDH folders in /var/lib.

How to fix:

chmod 755 /var/lib/accumulo
chmod 755 /var/lib/kafka
chmod 755 /var/lib/kudu
chmod 755 /var/lib/flume-ng
chmod 755 /var/lib/hadoop-hdfs
chmod 755 /var/lib/solr
chmod 755 /var/lib/zookeeper
chmod 755 /var/lib/llama
chmod 755 /var/lib/hadoop-httpfs
chmod 755 /var/lib/hadoop-mapreduce
chmod 755 /var/lib/sqoop
chmod 755 /var/lib/hadoop-kms
chmod 755 /var/lib/hive
chmod 755 /var/lib/sqoop2
chmod 755 /var/lib/oozie
chmod 755 /var/lib/hbase
chmod 755 /var/lib/sentry
chmod 755 /var/lib/impala
chmod 755 /var/lib/spark
chmod 755 /var/lib/hadoop-yarn

chown accumulo:accumulo /var/lib/accumulo
chown kafka:kafka /var/lib/kafka
chown kudu:kudu /var/lib/kudu
chown flume:flume /var/lib/flume-ng
chown hdfs:hdfs /var/lib/hadoop-hdfs
chown solr:solr /var/lib/solr
chown zookeeper:zookeeper /var/lib/zookeeper
chown llama:llama /var/lib/llama
chown httpfs:httpfs /var/lib/hadoop-httpfs
chown mapred:mapred /var/lib/hadoop-mapreduce
chown sqoop:sqoop /var/lib/sqoop
chown kms:kms /var/lib/hadoop-kms
chown hive:hive /var/lib/hive
chown sqoop2:sqoop2 /var/lib/sqoop2
chown oozie:oozie /var/lib/oozie
chown hbase:hbase /var/lib/hbase
chown sentry:sentry /var/lib/sentry
chown impala:impala /var/lib/impala
chown spark:spark /var/lib/spark
chown yarn:yarn /var/lib/hadoop-yarn
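The same fix can also be expressed as a single directory-to-owner mapping; the Python sketch below (my own consolidation of the commands above, to be run as root) applies mode 755 and the expected owner in one pass, with a dry-run mode that just lists the equivalent commands:

```python
import os
import shutil

# Directory -> "user:group", copied from the chmod/chown commands above
CDH_DIRS = {
    "/var/lib/accumulo": "accumulo:accumulo",
    "/var/lib/kafka": "kafka:kafka",
    "/var/lib/kudu": "kudu:kudu",
    "/var/lib/flume-ng": "flume:flume",
    "/var/lib/hadoop-hdfs": "hdfs:hdfs",
    "/var/lib/solr": "solr:solr",
    "/var/lib/zookeeper": "zookeeper:zookeeper",
    "/var/lib/llama": "llama:llama",
    "/var/lib/hadoop-httpfs": "httpfs:httpfs",
    "/var/lib/hadoop-mapreduce": "mapred:mapred",
    "/var/lib/sqoop": "sqoop:sqoop",
    "/var/lib/hadoop-kms": "kms:kms",
    "/var/lib/hive": "hive:hive",
    "/var/lib/sqoop2": "sqoop2:sqoop2",
    "/var/lib/oozie": "oozie:oozie",
    "/var/lib/hbase": "hbase:hbase",
    "/var/lib/sentry": "sentry:sentry",
    "/var/lib/impala": "impala:impala",
    "/var/lib/spark": "spark:spark",
    "/var/lib/hadoop-yarn": "yarn:yarn",
}

def fix_permissions(dirs=CDH_DIRS, dry_run=True):
    """Apply mode 755 and the expected owner to each CDH directory.

    With dry_run=True, only return the commands that would be run."""
    commands = []
    for path, owner in dirs.items():
        user, group = owner.split(":")
        commands.append("chmod 755 %s" % path)
        commands.append("chown %s %s" % (owner, path))
        if not dry_run and os.path.isdir(path):
            os.chmod(path, 0o755)
            shutil.chown(path, user=user, group=group)
    return commands

for cmd in fix_permissions():
    print(cmd)
```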


Install ipython on CentOS 7 / Red Hat EL 7


[root@cdh-vm ~]# yum install gcc python-devel python-setuptools

[root@cdh-vm ~]# rpm -ql python-setuptools.noarch |grep easy_install.py
/usr/lib/python2.7/site-packages/easy_install.py
/usr/lib/python2.7/site-packages/easy_install.pyc
/usr/lib/python2.7/site-packages/easy_install.pyo
/usr/lib/python2.7/site-packages/setuptools/command/easy_install.py
/usr/lib/python2.7/site-packages/setuptools/command/easy_install.pyc
/usr/lib/python2.7/site-packages/setuptools/command/easy_install.pyo


[root@cdh-vm ~]# python /usr/lib/python2.7/site-packages/easy_install.py pip
Searching for pip
Reading https://pypi.python.org/simple/pip/
Best match: pip 9.0.1
Downloading https://pypi.python.org/packages/11/b6/abcb525026a4be042b486df43905d6893fb04f05aac21c32c638e939e447/pip-9.0.1.tar.gz#md5=35f01da33009719497f01a4ba69d63c9
Processing pip-9.0.1.tar.gz
Writing /tmp/easy_install-Cjz0xZ/pip-9.0.1/setup.cfg
Running pip-9.0.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-Cjz0xZ/pip-9.0.1/egg-dist-tmp-G6YcYa
/usr/lib64/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'python_requires'
  warnings.warn(msg)
warning: no previously-included files found matching '.coveragerc'
warning: no previously-included files found matching '.mailmap'
warning: no previously-included files found matching '.travis.yml'
warning: no previously-included files found matching '.landscape.yml'
warning: no previously-included files found matching 'pip/_vendor/Makefile'
warning: no previously-included files found matching 'tox.ini'
warning: no previously-included files found matching 'dev-requirements.txt'
warning: no previously-included files found matching 'appveyor.yml'
no previously-included directories found matching '.github'
no previously-included directories found matching '.travis'
no previously-included directories found matching 'docs/_build'
no previously-included directories found matching 'contrib'
no previously-included directories found matching 'tasks'
no previously-included directories found matching 'tests'
Adding pip 9.0.1 to easy-install.pth file
Installing pip script to /usr/bin
Installing pip2.7 script to /usr/bin
Installing pip2 script to /usr/bin

Installed /usr/lib/python2.7/site-packages/pip-9.0.1-py2.7.egg
Processing dependencies for pip
Finished processing dependencies for pip


[root@cdh-vm ~]# pip install ipython
Collecting ipython
  Using cached ipython-5.5.0-py2-none-any.whl
Requirement already satisfied: prompt-toolkit<2.0.0,>=1.0.4 in /usr/lib/python2.7/site-packages (from ipython)
Requirement already satisfied: setuptools>=18.5 in /usr/lib/python2.7/site-packages (from ipython)
Requirement already satisfied: pexpect; sys_platform != "win32" in /usr/lib/python2.7/site-packages (from ipython)
Requirement already satisfied: backports.shutil-get-terminal-size; python_version == "2.7" in /usr/lib/python2.7/site-packages (from ipython)
Requirement already satisfied: decorator in /usr/lib/python2.7/site-packages (from ipython)
Requirement already satisfied: pygments in /usr/lib64/python2.7/site-packages (from ipython)
Collecting pathlib2; python_version == "2.7" or python_version == "3.3" (from ipython)
  Using cached pathlib2-2.3.0-py2.py3-none-any.whl
Collecting traitlets>=4.2 (from ipython)
  Using cached traitlets-4.3.2-py2.py3-none-any.whl
Collecting simplegeneric>0.8 (from ipython)
  Using cached simplegeneric-0.8.1.zip
Collecting pickleshare (from ipython)
  Using cached pickleshare-0.7.4-py2.py3-none-any.whl
Requirement already satisfied: wcwidth in /usr/lib/python2.7/site-packages (from prompt-toolkit<2.0.0,>=1.0.4->ipython)
Requirement already satisfied: six>=1.9.0 in /usr/lib/python2.7/site-packages (from prompt-toolkit<2.0.0,>=1.0.4->ipython)
Requirement already satisfied: ptyprocess>=0.5 in /usr/lib/python2.7/site-packages (from pexpect; sys_platform != "win32"->ipython)
Collecting scandir; python_version < "3.5" (from pathlib2; python_version == "2.7" or python_version == "3.3"->ipython)
  Using cached scandir-1.6.tar.gz
Collecting ipython-genutils (from traitlets>=4.2->ipython)
  Using cached ipython_genutils-0.2.0-py2.py3-none-any.whl
Collecting enum34; python_version == "2.7" (from traitlets>=4.2->ipython)
  Using cached enum34-1.1.6-py2-none-any.whl
Installing collected packages: scandir, pathlib2, ipython-genutils, enum34, traitlets, simplegeneric, pickleshare, ipython
  Running setup.py install for scandir ... done
  Running setup.py install for simplegeneric ... done
Successfully installed enum34-1.1.6 ipython-5.5.0 ipython-genutils-0.2.0 pathlib2-2.3.0 pickleshare-0.7.4 scandir-1.6 simplegeneric-0.8.1 traitlets-4.3.2

Configure Cloudera Kerberos Authentication using MIT KDC


- Create a Kerberos admin user who has privileges to add other principals:

[root@cdh-vm krb5kdc]# kadmin.local  -q "addprinc cloudera-scm/admin"

-- Before proceeding, verify the KDC works:

ktutil:  add_entry -password -p cloudera-scm/admin -k 1 -e aes256-cts-hmac-sha1-96
Password for cloudera-scm/admin@DBAGLOBE.COM:

[root@cdh-vm log]#  klist -e
Ticket cache: KEYRING:persistent:0:krb_ccache_r0tnzhY
Default principal: cloudera-scm/admin@DBAGLOBE.COM

Valid starting       Expires              Service principal
11/09/2017 23:09:42  11/10/2017 23:09:42  krbtgt/DBAGLOBE.COM@DBAGLOBE.COM
        Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96


Encountered errors:

2017-11-09 23:34:18,010 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8022: readAndProcess from client 192.168.56.10 threw exception [javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]]
2017-11-09 23:34:18,655 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30002 milliseconds
2017-11-09 23:34:18,656 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2017-11-09 23:34:22,804 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8022: readAndProcess from client 192.168.56.10 threw exception [javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]]

How to Fix:

[root@cdh-vm security]# pwd
/usr/java/jdk1.8.0_144/jre/lib/security

[root@cdh-vm security]# mkdir limited
[root@cdh-vm security]# mv *.jar limited/


[root@cdh-vm security]# unzip /home/donghua/jce_policy-8.zip -d  /home/donghua/
[root@cdh-vm security]# cp /home/donghua/UnlimitedJCEPolicyJDK8/*.jar .
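
If replacing the JDK policy files is not feasible, a commonly documented alternative (not used in this walkthrough, shown only as a sketch) is to keep AES-256 out of the picture by restricting the encryption types the KDC issues, for example in krb5.conf; existing principals would need their keys regenerated afterwards:

```ini
# /etc/krb5.conf -- illustrative [libdefaults] fragment only
[libdefaults]
  default_realm = DBAGLOBE.COM
  # Stick to AES-128 so the stock "limited" JCE policy is sufficient
  default_tkt_enctypes = aes128-cts-hmac-sha1-96
  default_tgs_enctypes = aes128-cts-hmac-sha1-96
  permitted_enctypes   = aes128-cts-hmac-sha1-96
```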

Install Cloudera Manager and CDH using local repository


Step 1: install prerequisite packages:

yum install yum-utils createrepo httpd

Step 2: prepare httpd server

>>>> /etc/httpd/conf.d/cloudera.conf

Alias "/cm" "/repo/cm"
<Directory "/repo/cm">
    Options Indexes FollowSymLinks
    AllowOverride None
    Require all granted
</Directory>
Alias "/cdh" "/repo/cdh"
<Directory "/repo/cdh">
    Options Indexes FollowSymLinks
    AllowOverride None
    Require all granted
</Directory>

Step 3: download the CM RPMs and CDH parcels into their respective folders.

https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5.13.0/RPMS/

https://archive.cloudera.com/cdh5/parcels/5.13/

Step 4: Setup local repo

[root@cdh-vm ~]# createrepo /repo/cm/

>>>>>>> /etc/yum.repos.d/cloudera-local.repo
[cloudera-local]
name=cloudera-local
baseurl=http://cdh-vm/cm/
gpgcheck=0
enabled=1

Step 5: Install Cloudera Manager

[root@cdh-vm cm]# yum install cloudera-manager-server cloudera-manager-daemons

Step 6: Configure CM using mysql

[root@cdh-vm ~]# /usr/share/cmf/schema/scm_prepare_database.sh mysql cmserver cmserver password

Step 7: log in to http://cdh-vm.dbaglobe.com:7180 to set up the cluster


RDD Lineage and Persistence


Example 1: Without Persistence

>>> hs=sc.textFile('hdfs://cdh-vm/user/donghua/hadoopsecurity.txt')
>>> rdd1=hs.map(lambda line:line.upper())
>>> rdd2=rdd1.filter(lambda line:line.startswith('E'))
>>> rdd2.collect()
[u'ENABLING KERBEROS AUTHENTICATION USING CLOUDERA MANAGER']

>>> print rdd2.toDebugString()
(2) PythonRDD[15] at collect at <stdin>:1 []
  |  hdfs://cdh-vm/user/donghua/hadoopsecurity.txt MapPartitionsRDD[14] at textFile at NativeMethodAccessorImpl.java:-2 []
  |  hdfs://cdh-vm/user/donghua/hadoopsecurity.txt HadoopRDD[13] at textFile at NativeMethodAccessorImpl.java:-2 []

Example 2: With default Persistence for RDD1


>>> hs=sc.textFile('hdfs://cdh-vm/user/donghua/hadoopsecurity.txt')
>>> rdd1=hs.map(lambda line:line.upper())
>>> rdd1.persist()
PythonRDD[18] at RDD at PythonRDD.scala:43
>>> rdd2=rdd1.filter(lambda line:line.startswith('E'))
>>> rdd2.collect()
[u'ENABLING KERBEROS AUTHENTICATION USING CLOUDERA MANAGER']


>>> print rdd2.toDebugString()
(2) PythonRDD[19] at collect at <stdin>:1 []
  |  PythonRDD[18] at RDD at PythonRDD.scala:43 []
  |      CachedPartitions: 2; MemorySize: 577.0 B; ExternalBlockStoreSize: 0.0 B; DiskSize: 0.0 B
  |  hdfs://cdh-vm/user/donghua/hadoopsecurity.txt MapPartitionsRDD[17] at textFile at NativeMethodAccessorImpl.java:-2 []
  |  hdfs://cdh-vm/user/donghua/hadoopsecurity.txt HadoopRDD[16] at textFile at NativeMethodAccessorImpl.java:-2 []

Example 3: With default Persistence for HS and RDD1, and MEMORY_AND_DISK Persistence for RDD2

>>> from pyspark import StorageLevel
>>> hs=sc.textFile('hdfs://cdh-vm/user/donghua/hadoopsecurity.txt')
>>> hs.persist()
hdfs://cdh-vm/user/donghua/hadoopsecurity.txt MapPartitionsRDD[25] at textFile at NativeMethodAccessorImpl.java:-2
>>> rdd1=hs.map(lambda line:line.upper())
>>> rdd1.persist()
PythonRDD[26] at RDD at PythonRDD.scala:43
>>> rdd2=rdd1.filter(lambda line:line.startswith('E'))
>>> rdd2.persist(StorageLevel.MEMORY_AND_DISK)
PythonRDD[27] at RDD at PythonRDD.scala:43

>>> rdd2.collect()
[u'ENABLING KERBEROS AUTHENTICATION USING CLOUDERA MANAGER']
>>> print rdd2.toDebugString()
(2) PythonRDD[27] at RDD at PythonRDD.scala:43 [Disk Memory Deserialized 1x Replicated]
  |       CachedPartitions: 2; MemorySize: 128.0 B; ExternalBlockStoreSize: 0.0 B; DiskSize: 0.0 B
  |  PythonRDD[26] at RDD at PythonRDD.scala:43 [Disk Memory Deserialized 1x Replicated]
  |      CachedPartitions: 2; MemorySize: 577.0 B; ExternalBlockStoreSize: 0.0 B; DiskSize: 0.0 B
  |  hdfs://cdh-vm/user/donghua/hadoopsecurity.txt MapPartitionsRDD[25] at textFile at NativeMethodAccessorImpl.java:-2 [Disk Memory Deserialized 1x Replicated]
  |      CachedPartitions: 2; MemorySize: 491.0 B; ExternalBlockStoreSize: 0.0 B; DiskSize: 0.0 B
  |  hdfs://cdh-vm/user/donghua/hadoopsecurity.txt HadoopRDD[24] at textFile at NativeMethodAccessorImpl.java:-2 [Disk Memory Deserialized 1x Replicated]
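
The behaviour the three examples demonstrate is laziness plus optional caching: a transformation only records its parent RDD, work happens at an action, and persist() marks a node so its computed partitions are reused by later actions. A toy pure-Python sketch of the idea (this is an illustration only, not Spark's implementation; the class and names are made up):

```python
class ToyRDD:
    """Minimal stand-in for an RDD: records lineage, computes lazily."""

    def __init__(self, compute, parent=None):
        self._compute = compute    # zero-argument function producing a list
        self.parent = parent       # lineage pointer, like toDebugString shows
        self._persisted = False
        self._cached = None        # filled only after persist() + an action

    def map(self, f):
        # No work happens here -- we only record how to derive this RDD
        return ToyRDD(lambda: [f(x) for x in self.collect()], parent=self)

    def filter(self, f):
        return ToyRDD(lambda: [x for x in self.collect() if f(x)], parent=self)

    def persist(self):
        self._persisted = True
        return self

    def collect(self):
        # An action: reuse cached partitions if this node was persisted
        if self._cached is not None:
            return self._cached
        result = self._compute()
        if self._persisted:
            self._cached = result
        return result


# Usage mirroring the pyspark session above (input inlined instead of HDFS)
hs = ToyRDD(lambda: ['Enabling Kerberos', 'other line'])
rdd1 = hs.map(lambda line: line.upper()).persist()
rdd2 = rdd1.filter(lambda line: line.startswith('E'))
print(rdd2.collect())  # ['ENABLING KERBEROS']
```

A second rdd2.collect() would reuse rdd1's cached list instead of re-running the map, which is exactly why Example 2's debug string shows CachedPartitions.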


Using ipython in pyspark


Here is the link for ipython installation: http://www.dbaglobe.com/2017/11/install-ipython-on-centos7-redhat-el-7.html


If you use Spark < 1.2 you can simply execute bin/pyspark with the environment variable IPYTHON=1:

IPYTHON=1 /usr/bin/pyspark


or

export IPYTHON=1
/usr/bin/pyspark

While the above still works on Spark 1.2 and above, the recommended way to set the Python environment for these versions is PYSPARK_DRIVER_PYTHON:


PYSPARK_DRIVER_PYTHON=ipython /usr/bin/pyspark

or

export PYSPARK_DRIVER_PYTHON=ipython
/usr/bin/pyspark
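
What the launcher effectively does is pick the driver executable from these variables; a simplified pure-Python illustration of that lookup order (the real pyspark launcher script is more involved):

```python
def pick_driver(environ):
    """Return the Python executable a pyspark-like launcher would run.

    Sketch of the lookup order: PYSPARK_DRIVER_PYTHON wins, then
    PYSPARK_PYTHON, then a plain 'python' fallback.
    """
    return (environ.get('PYSPARK_DRIVER_PYTHON')
            or environ.get('PYSPARK_PYTHON')
            or 'python')


print(pick_driver({'PYSPARK_DRIVER_PYTHON': 'ipython'}))  # ipython
print(pick_driver({}))                                    # python
```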


Install and configure jupyter for pyspark


[root@cdh-vm ~]# pip install jupyter


[donghua@cdh-vm ~]$ grep PYSPARK  ~/.bash_profile
export PYSPARK_DRIVER_PYTHON=ipython
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --ip 192.168.56.10 --port 3333 --no-mathjax"

[donghua@cdh-vm ~]$ pyspark
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
[I 13:34:34.137 NotebookApp] Serving notebooks from local directory: /home/donghua
[I 13:34:34.137 NotebookApp] 0 active kernels
[I 13:34:34.137 NotebookApp] The Jupyter Notebook is running at:
[I 13:34:34.137 NotebookApp] http://192.168.56.10:3333/?token=3a127150007c4f0816871644a97feb3b2c1ad721411e9576
[I 13:34:34.137 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 13:34:34.138 NotebookApp] No web browser found: could not locate runnable browser.
[C 13:34:34.138 NotebookApp]

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://192.168.56.10:3333/?token=3a127150007c4f0816871644a97feb3b2c1ad721411e9576


================================================

[root@cdh-vm ~]# pip install jupyter
Collecting jupyter
  Downloading jupyter-1.0.0-py2.py3-none-any.whl
Collecting nbconvert (from jupyter)
  Downloading nbconvert-5.3.1-py2.py3-none-any.whl (387kB)
    100% |████████████████████████████████| 389kB 723kB/s
Collecting ipywidgets (from jupyter)
  Downloading ipywidgets-7.0.5-py2.py3-none-any.whl (68kB)
    100% |████████████████████████████████| 71kB 3.2MB/s
Collecting notebook (from jupyter)
  Downloading notebook-5.2.1-py2.py3-none-any.whl (8.0MB)
    100% |████████████████████████████████| 8.0MB 126kB/s
Collecting qtconsole (from jupyter)
  Downloading qtconsole-4.3.1-py2.py3-none-any.whl (108kB)
    100% |████████████████████████████████| 112kB 4.3MB/s
Collecting jupyter-console (from jupyter)
  Downloading jupyter_console-5.2.0-py2.py3-none-any.whl
Collecting ipykernel (from jupyter)
  Downloading ipykernel-4.6.1-py2-none-any.whl (104kB)
    100% |████████████████████████████████| 112kB 5.6MB/s
Collecting pandocfilters>=1.4.1 (from nbconvert->jupyter)
  Downloading pandocfilters-1.4.2.tar.gz
Collecting entrypoints>=0.2.2 (from nbconvert->jupyter)
  Downloading entrypoints-0.2.3-py2.py3-none-any.whl
Collecting jinja2 (from nbconvert->jupyter)
  Downloading Jinja2-2.10-py2.py3-none-any.whl (126kB)
    100% |████████████████████████████████| 133kB 3.6MB/s
Collecting testpath (from nbconvert->jupyter)
  Downloading testpath-0.3.1-py2.py3-none-any.whl (161kB)
    100% |████████████████████████████████| 163kB 2.6MB/s
Collecting mistune>=0.7.4 (from nbconvert->jupyter)
  Downloading mistune-0.8.1-py2.py3-none-any.whl
Collecting nbformat>=4.4 (from nbconvert->jupyter)
  Downloading nbformat-4.4.0-py2.py3-none-any.whl (155kB)
    100% |████████████████████████████████| 163kB 2.9MB/s
Requirement already satisfied: pygments in /usr/lib64/python2.7/site-packages (from nbconvert->jupyter)
Collecting bleach (from nbconvert->jupyter)
  Downloading bleach-2.1.1-py2.py3-none-any.whl
Requirement already satisfied: traitlets>=4.2 in /usr/lib/python2.7/site-packages (from nbconvert->jupyter)
Collecting jupyter-core (from nbconvert->jupyter)
  Downloading jupyter_core-4.4.0-py2.py3-none-any.whl (126kB)
    100% |████████████████████████████████| 133kB 3.5MB/s
Requirement already satisfied: ipython<6.0.0,>=4.0.0; python_version < "3.3" in /usr/lib/python2.7/site-packages (from ipywidgets->jupyter)
Collecting widgetsnbextension~=3.0.0 (from ipywidgets->jupyter)
  Downloading widgetsnbextension-3.0.8-py2.py3-none-any.whl (2.2MB)
    100% |████████████████████████████████| 2.2MB 347kB/s
Requirement already satisfied: ipython-genutils in /usr/lib/python2.7/site-packages (from notebook->jupyter)
Collecting jupyter-client (from notebook->jupyter)
  Downloading jupyter_client-5.1.0-py2.py3-none-any.whl (84kB)
    100% |████████████████████████████████| 92kB 2.8MB/s
Collecting tornado>=4 (from notebook->jupyter)
  Downloading tornado-4.5.2.tar.gz (483kB)
    100% |████████████████████████████████| 491kB 1.8MB/s
Collecting terminado>=0.3.3; sys_platform != "win32" (from notebook->jupyter)
  Downloading terminado-0.7-py2.py3-none-any.whl
Requirement already satisfied: prompt-toolkit<2.0.0,>=1.0.0 in /usr/lib/python2.7/site-packages (from jupyter-console->jupyter)
Collecting configparser>=3.5; python_version == "2.7" (from entrypoints>=0.2.2->nbconvert->jupyter)
  Downloading configparser-3.5.0.tar.gz
Collecting MarkupSafe>=0.23 (from jinja2->nbconvert->jupyter)
  Downloading MarkupSafe-1.0.tar.gz
Collecting jsonschema!=2.5.0,>=2.4 (from nbformat>=4.4->nbconvert->jupyter)
  Downloading jsonschema-2.6.0-py2.py3-none-any.whl
Collecting html5lib!=1.0b1,!=1.0b2,!=1.0b3,!=1.0b4,!=1.0b5,!=1.0b6,!=1.0b7,!=1.0b8,>=0.99999999pre (from bleach->nbconvert->jupyter)
  Downloading html5lib-1.0b10-py2.py3-none-any.whl (112kB)
    100% |████████████████████████████████| 112kB 3.7MB/s
Requirement already satisfied: six in /usr/lib/python2.7/site-packages (from bleach->nbconvert->jupyter)
Requirement already satisfied: decorator in /usr/lib/python2.7/site-packages (from traitlets>=4.2->nbconvert->jupyter)
Requirement already satisfied: enum34; python_version == "2.7" in /usr/lib/python2.7/site-packages (from traitlets>=4.2->nbconvert->jupyter)
Requirement already satisfied: pexpect; sys_platform != "win32" in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=4.0.0; python_version < "3.3"->ipywidgets->jupyter)
Requirement already satisfied: backports.shutil-get-terminal-size; python_version == "2.7" in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=4.0.0; python_version < "3.3"->ipywidgets->jupyter)
Requirement already satisfied: setuptools>=18.5 in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=4.0.0; python_version < "3.3"->ipywidgets->jupyter)
Requirement already satisfied: pathlib2; python_version == "2.7" or python_version == "3.3" in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=4.0.0; python_version < "3.3"->ipywidgets->jupyter)
Requirement already satisfied: simplegeneric>0.8 in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=4.0.0; python_version < "3.3"->ipywidgets->jupyter)
Requirement already satisfied: pickleshare in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=4.0.0; python_version < "3.3"->ipywidgets->jupyter)
Collecting pyzmq>=13 (from jupyter-client->notebook->jupyter)
  Downloading pyzmq-16.0.3-cp27-cp27mu-manylinux1_x86_64.whl (3.0MB)
    100% |████████████████████████████████| 3.0MB 299kB/s
Collecting python-dateutil>=2.1 (from jupyter-client->notebook->jupyter)
  Downloading python_dateutil-2.6.1-py2.py3-none-any.whl (194kB)
    100% |████████████████████████████████| 194kB 2.6MB/s
Requirement already satisfied: backports.ssl_match_hostname in /usr/lib/python2.7/site-packages (from tornado>=4->notebook->jupyter)
Collecting singledispatch (from tornado>=4->notebook->jupyter)
  Downloading singledispatch-3.4.0.3-py2.py3-none-any.whl
Collecting certifi (from tornado>=4->notebook->jupyter)
  Downloading certifi-2017.11.5-py2.py3-none-any.whl (330kB)
    100% |████████████████████████████████| 337kB 596kB/s
Collecting backports_abc>=0.4 (from tornado>=4->notebook->jupyter)
  Downloading backports_abc-0.5-py2.py3-none-any.whl
Requirement already satisfied: ptyprocess in /usr/lib/python2.7/site-packages (from terminado>=0.3.3; sys_platform != "win32"->notebook->jupyter)
Requirement already satisfied: wcwidth in /usr/lib/python2.7/site-packages (from prompt-toolkit<2.0.0,>=1.0.0->jupyter-console->jupyter)
Collecting functools32; python_version == "2.7" (from jsonschema!=2.5.0,>=2.4->nbformat>=4.4->nbconvert->jupyter)
  Downloading functools32-3.2.3-2.zip
Collecting webencodings (from html5lib!=1.0b1,!=1.0b2,!=1.0b3,!=1.0b4,!=1.0b5,!=1.0b6,!=1.0b7,!=1.0b8,>=0.99999999pre->bleach->nbconvert->jupyter)
  Downloading webencodings-0.5.1-py2.py3-none-any.whl
Requirement already satisfied: scandir; python_version < "3.5" in /usr/lib64/python2.7/site-packages (from pathlib2; python_version == "2.7" or python_version == "3.3"->ipython<6.0.0,>=4.0.0; python_version < "3.3"->ipywidgets->jupyter)
Installing collected packages: pandocfilters, configparser, entrypoints, MarkupSafe, jinja2, testpath, mistune, functools32, jsonschema, jupyter-core, nbformat, webencodings, html5lib, bleach, nbconvert, pyzmq, python-dateutil, jupyter-client, singledispatch, certifi, backports-abc, tornado, terminado, ipykernel, notebook, widgetsnbextension, ipywidgets, qtconsole, jupyter-console, jupyter
  Running setup.py install for pandocfilters ... done
  Running setup.py install for configparser ... done
  Running setup.py install for MarkupSafe ... done
  Running setup.py install for functools32 ... done
  Running setup.py install for tornado ... done
Successfully installed MarkupSafe-1.0 backports-abc-0.5 bleach-2.1.1 certifi-2017.11.5 configparser-3.5.0 entrypoints-0.2.3 functools32-3.2.3.post2 html5lib-1.0b10 ipykernel-4.6.1 ipywidgets-7.0.5 jinja2-2.10 jsonschema-2.6.0 jupyter-1.0.0 jupyter-client-5.1.0 jupyter-console-5.2.0 jupyter-core-4.4.0 mistune-0.8.1 nbconvert-5.3.1 nbformat-4.4.0 notebook-5.2.1 pandocfilters-1.4.2 python-dateutil-2.6.1 pyzmq-16.0.3 qtconsole-4.3.1 singledispatch-3.4.0.3 terminado-0.7 testpath-0.3.1 tornado-4.5.2 webencodings-0.5.1 widgetsnbextension-3.0.8
[root@cdh-vm ~]#

Installing or Upgrading Cloudera Distribution of Apache Spark 2


Reference URLs:

https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html

https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_addon_services.html

https://www.cloudera.com/documentation/spark2/latest/topics/spark2_packaging.html


[root@cdh-vm csd]# wget http://archive.cloudera.com/spark2/csd/SPARK2_ON_YARN-2.2.0.cloudera1.jar -O /opt/cloudera/csd/SPARK2_ON_YARN-2.2.0.cloudera1.jar

[root@cdh-vm csd]# chown cloudera-scm:cloudera-scm  /opt/cloudera/csd/SPARK2_ON_YARN-2.2.0.cloudera1.jar
[root@cdh-vm csd]# chmod 644 /opt/cloudera/csd/SPARK2_ON_YARN-2.2.0.cloudera1.jar
[root@cdh-vm csd]# ls -l /opt/cloudera/csd/SPARK2_ON_YARN-2.2.0.cloudera1.jar
-rw-r--r-- 1 cloudera-scm cloudera-scm 17240 Jul 13 10:17 /opt/cloudera/csd/SPARK2_ON_YARN-2.2.0.cloudera1.jar

[root@cdh-vm csd]# /etc/init.d/cloudera-scm-server restart


- Download, Distribute and Activate the SPARK2 parcels.


- Continue to add the Spark2 Service


[donghua@cdh-vm ~]$ pyspark2
Python 2.7.5 (default, Aug  4 2017, 00:39:18)
Type "copyright", "credits" or "license" for more information.

IPython 5.5.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.2.0.cloudera1
      /_/

Using Python version 2.7.5 (default, Aug  4 2017 00:39:18)
SparkSession available as 'spark'.

In [1]: t=spark.read.text('/user/donghua/hadoopsecurity.txt')
17/11/19 08:35:44 WARN streaming.FileStreamSink: Error while looking for metadata directory.

In [2]: t.collect()
Out[2]:
[Row(value=u'Kerberos Principals and Keytabs'),
  Row(value=u'Why Use Cloudera Manager to Implement Hadoop Security?'),
  Row(value=u'Enabling Kerberos Authentication Using Cloudera Manager'),
  Row(value=u'Viewing and Regenerating Kerberos Principals'),
  Row(value=u'Configuring LDAP Group Mappings'),
  Row(value=u'Mapping Kerberos Principals to Short Names'),
  Row(value=u'Troubleshooting Kerberos Security Issues'),
  Row(value=u'Known Kerberos Issues in Cloudera Manager'),
  Row(value=u'Appendix A - Manually Configuring Kerberos Using Cloudera Manager'),
  Row(value=u'Appendix B - Set up a Cluster-dedicated MIT KDC and Default Domain for the Hadoop Cluster'),
  Row(value=u'Appendix C - Hadoop Users in Cloudera Manager')]
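
For comparison, the "one Row per line" shape that spark.read.text produces can be mimicked in plain Python, which is handy for sanity-checking results locally (illustrative only; the namedtuple stand-in is not Spark's Row class):

```python
from collections import namedtuple

# Mirrors the Row(value=...) objects printed by t.collect() above
Row = namedtuple('Row', ['value'])


def read_text(lines):
    """Mimic the shape of spark.read.text(...).collect(): one Row per line."""
    return [Row(value=line) for line in lines]


sample = ['Kerberos Principals and Keytabs',
          'Enabling Kerberos Authentication Using Cloudera Manager']
rows = read_text(sample)
print(rows[0].value)  # Kerberos Principals and Keytabs
print([r.value for r in rows if r.value.startswith('E')])
```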

Download RPM and Dependencies without actual installation


[root@localhost ~]# yum install --downloadonly --downloaddir /root/postgresql postgresql10 postgresql10-server
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
  * base: mirror.qoxy.com
  * extras: centos.ipserverone.com
  * updates: centos.ipserverone.com
Resolving Dependencies
--> Running transaction check
---> Package postgresql10.x86_64 0:10.1-1PGDG.rhel7 will be installed
--> Processing Dependency: postgresql10-libs(x86-64) = 10.1-1PGDG.rhel7 for package: postgresql10-10.1-1PGDG.rhel7.x86_64
--> Processing Dependency: libicu for package: postgresql10-10.1-1PGDG.rhel7.x86_64
--> Processing Dependency: libpq.so.5()(64bit) for package: postgresql10-10.1-1PGDG.rhel7.x86_64
---> Package postgresql10-server.x86_64 0:10.1-1PGDG.rhel7 will be installed
--> Running transaction check
---> Package libicu.x86_64 0:50.1.2-15.el7 will be installed
---> Package postgresql10-libs.x86_64 0:10.1-1PGDG.rhel7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
  Package                         Arch               Version                      Repository                          Size
================================================================================
Installing:
  postgresql10                    x86_64             10.1-1PGDG.rhel7             pgdg10-updates-testing             1.5 M
  postgresql10-server             x86_64             10.1-1PGDG.rhel7             pgdg10-updates-testing             4.3 M
Installing for dependencies:
  libicu                          x86_64             50.1.2-15.el7                base                               6.9 M
  postgresql10-libs               x86_64             10.1-1PGDG.rhel7             pgdg10-updates-testing             347 k

Transaction Summary
====================================================================================
Install  2 Packages (+2 Dependent packages)

Total download size: 13 M
Installed size: 49 M
Background downloading packages, then exiting:
(1/4): libicu-50.1.2-15.el7.x86_64.rpm                                                             | 6.9 MB  00:00:00
(2/4): postgresql10-libs-10.1-1PGDG.rhel7.x86_64.rpm                                               | 347 kB  00:00:03
(3/4): postgresql10-10.1-1PGDG.rhel7.x86_64.rpm                                                    | 1.5 MB  00:00:05
(4/4): postgresql10-server-10.1-1PGDG.rhel7.x86_64.rpm                                             | 4.3 MB  00:00:06
--------------------------------------------------------------------------------------------------------------------------
Total                                                                                     1.3 MB/s |  13 MB  00:00:10
exiting because "Download Only" specified

Change Password for Kerberos (admin) user


-- change password

[root@cdh-vm postgresql]# kadmin cpw cloudera-scm/admin@DBAGLOBE.COM
Password for cloudera-scm/admin@DBAGLOBE.COM:
Enter password for principal "cloudera-scm/admin@DBAGLOBE.COM":
Re-enter password for principal "cloudera-scm/admin@DBAGLOBE.COM":

[root@cdh-vm postgresql]# kinit cloudera-scm/admin@DBAGLOBE.COM
Password for cloudera-scm/admin@DBAGLOBE.COM:

[root@cdh-vm postgresql]#  klist -e
Ticket cache: KEYRING:persistent:0:0
Default principal: cloudera-scm/admin@DBAGLOBE.COM

Valid starting       Expires              Service principal
12/01/2017 05:32:33  12/02/2017 05:32:33  krbtgt/DBAGLOBE.COM@DBAGLOBE.COM
        Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
